POLYPEPTIDES THAT BIND HIV gp120 AND RELATED NUCLEIC ACIDS,
ANTIBODIES, COMPOSITIONS, AND METHODS OF USE

TECHNICAL FIELD OF THE INVENTION 
The present invention relates to polypeptides with
homology to regions of domains of the human chemokine
receptors CCR5, CXCR4, and STRL33, as well as domains of
CD4, that bind with human immunodeficiency virus (HIV), in
particular HIV-1 glycoprotein 120 (gp120) envelope protein.
The present invention also relates to nucleic acids 
encoding such polypeptides, antibodies, compositions 
comprising such polypeptides, nucleic acids or antibodies, 
and methods of using the same. 

BACKGROUND OF THE INVENTION 

There are seven transmembrane chemokine receptors that
act as cofactors for HIV infection. The cofactors enable
entry of HIV-1 into CD4+ T cells and macrophages (Premack et
al., Nature Medicine 2: 1174-78 (1996); and Zhang et al.,
Nature 383: 768 (1996)).

The presence of chemokines has an inhibitory effect on 
HIV-1 attachment to, and infection of, susceptible cells. 
Additionally, some mutations in chemokine receptors have 
been shown to result in resistance to HIV-1 infection. For 
example, a 32-nucleotide deletion within the CCR5 gene has
been described in subjects who remained uninfected despite
repeated exposures to HIV-1 (Huang et al., Nature Medicine
2: 1240-43 (1996)).

Evidence also exists for the physical association of a
ternary complex between chemokine receptors, CD4, and HIV-1
gp120 envelope glycoprotein on cell membranes (Lapham et
al., Science 274: 602-05 (1996)). Receptor signaling and
cell activation are probably not required for the
anti-HIV-1 effect of chemokines, since a RANTES analog
lacking the first eight amino-terminal amino acids, RANTES
(9-68), lacked chemotactic and leukocyte-activating
properties, but bound to multiple chemokine receptors and
inhibited infection by macrophage-tropic HIV-1
(Arenzana-Seisdedos et al., Nature 383: 400 (1996)).
Cumulatively, the above-described results suggest that the
interaction between gp120, CD4, and at least one chemokine
receptor is obligatory for HIV-1 infection. Accordingly,
reagents that interfere with the binding of gp120 to
chemokine receptors and to CD4 are used in the biological
and medical arts.

However, there presently exists a need for additional
reagents that can compete with one or more proteins of the
gp120-CD4-chemokine receptor complex to assist in basic
biological or viral research, and to assist in medical
intervention in the HIV-1 pandemic. It is an object of the
present invention to provide such reagents. This and other
objects and advantages, including additional inventive
features, will be apparent from the description provided
herein.

BRIEF SUMMARY OF THE INVENTION

The present invention provides a polypeptide that
binds with HIV gp120 under physiological conditions.
Multiple embodiments of the present inventive polypeptide
are provided, and each embodiment possesses a degree of
homology to at least one of the human CCR5, CXCR4 and
STRL33 chemokine receptors, and the human CD4 cell-surface
protein.

In a first embodiment, the present invention provides
a polypeptide comprising the amino acid sequence YDIXYYXXE
(SEQ ID NO: 1), wherein X is any synthetic or naturally
occurring amino acid residue, and the polypeptide comprises
less than about 100 contiguous amino acids that are
identical to, or, in the alternative, substantially
identical to, the amino acid sequence of the human CCR5
chemokine receptor. A preferred polypeptide of this first
embodiment comprises the amino acid sequence YDIN*YYT*S*E
(SEQ ID NO: 3). A more preferred polypeptide of this first
embodiment comprises the amino acid sequence YDINYYTSE (SEQ
ID NO: 3), wherein each letter is the standard one-letter
abbreviation for an amino acid residue (for example,
N denotes asparaginyl, T denotes threoninyl, and S denotes
serinyl). The polypeptide of the first embodiment can
comprise the amino acid sequence
M*D*YQ*V*S*SP*IYDIN*YYT*S*E (SEQ ID NO: 5). Preferably,
the polypeptide comprises the amino acid sequence
MDYQVSSPIYDINYYTSE (SEQ ID NO: 5).

In a second embodiment, the present invention provides
a polypeptide comprising the amino acid sequence
XEXIXIYXXXNYXXX (SEQ ID NO: 6), wherein X is any synthetic
or naturally occurring amino acid and wherein said
polypeptide comprises less than about 100 contiguous amino
acids that are identical to or substantially identical to
the amino acid sequence of the human CXCR4 chemokine
receptor. The polypeptide can consist essentially of, or
consist of, the sequence EXIXIYXXXNY (SEQ ID NO: 7).
Preferably, the polypeptide comprises the sequence
M*EG*IS*IYT*S*D*NYT*E*E*. Preferably,
M*EG*IS*IYT*S*D*NYT*E*E* is M*EGISIYTSDNYT*E*E*.

In a third embodiment, the present invention provides
a polypeptide comprising the amino acid sequence EHQAFLQFS
(SEQ ID NO: 10), wherein said polypeptide comprises less
than about 100 contiguous amino acids that are identical to
or substantially identical to the amino acid sequence of
the human STRL33 chemokine receptor. The polypeptide can
consist essentially of, or consist of, the sequence
EHQAFLQFS (SEQ ID NO: 10).

In a fourth embodiment, the present invention provides
a polypeptide comprising at least a portion of an amino
acid sequence selected from the group consisting of
LPPLYSLVFIFGFVGNML (SEQ ID NO: 11), QWDFGNTMCQLLTGLYFIGFFS
(SEQ ID NO: 12), SQYQFWKNFQTLKIVILG (SEQ ID NO: 13),
APYNIVLLLNTFQEFFGLNNCS (SEQ ID NO: 14), and
YAFVGEKFRNYLLVFFQK (SEQ ID NO: 15), wherein said
polypeptide comprises less than about 100 contiguous amino
acids that are identical to or substantially identical to
the amino acid sequence of the human CCR5 chemokine
receptor.

In a fifth embodiment, the present invention provides
a polypeptide comprising at least a portion of an amino
acid sequence selected from the group consisting of
LLLTIPDFIFANVSEADD (SEQ ID NO: 16), WFQFQHIMVGLILPGIV (SEQ
ID NO: 17), and IDSFILLEIIKQGCEFEN (SEQ ID NO: 18), wherein
said polypeptide comprises less than about 100 contiguous
amino acids that are identical to or substantially
identical to the amino acid sequence of the human CXCR4
chemokine receptor.

In a sixth embodiment, the present invention provides
a polypeptide comprising at least a portion of an amino
acid sequence selected from the group consisting of
LVISIFYHKLQSLTDVFL (SEQ ID NO: 19), PFWAYAGIHEWVFGQVMC (SEQ
ID NO: 20), EAISTWLATQMTLGFFL (SEQ ID NO: 21),
LTMIVCYSVIIKTLLHAG (SEQ ID NO: 22), MAVFLLTQMPFNLMKFIRSTHW
(SEQ ID NO: 23), HWEYYAMTSFHYTIMVTE (SEQ ID NO: 24),
ACLNPVLYAFVSLKFRKN (SEQ ID NO: 25) and SKTFSASHNVEATSMFQL
(SEQ ID NO: 26), wherein said polypeptide comprises less
than about 100 contiguous amino acids that are identical to
or substantially identical to the amino acid sequence of
the human STRL33 chemokine receptor.

In a seventh embodiment, the present invention
provides a polypeptide comprising at least a portion of an
amino acid sequence selected from the group consisting of
DTYICEVED (SEQ ID NO: 27), EEVQLLVFGLTANSD (SEQ ID NO: 28),
THLLQGQSLTLTLES (SEQ ID NO: 29), and GEQVEFSFPLAFTVE (SEQ
ID NO: 30), wherein said polypeptide comprises less than
about 100 contiguous amino acids that are identical to or
substantially identical to the amino acid sequence of the
human CD4 cell-surface protein.

In the fourth to seventh embodiments, any selected
portion of the polypeptide can comprise from 1 to about 6
conservative amino acid substitutions. In an alternative,
the polypeptide can be partially defined by an absence of a
polypeptide sequence, outside the region of the portion
selected from the foregoing sequences, that has five, or
ten, contiguous amino acid residues whose sequence is
identical to or substantially identical to a sequence of
the protein to which the polypeptide has homology (i.e.,
CCR5, CXCR4, STRL33, or CD4). In yet another alternative,
the polypeptide can lack a sequence of five or ten
contiguous amino acids which are identical to or
substantially identical to the sequence of the protein with
which the sequence has homology, except that one or more
conservatively or neutrally substituted amino acids replace
part of the sequence of the protein to which the
polypeptide has homology. Additionally, any embodiment of
the present inventive polypeptide can also comprise a
pharmaceutically acceptable substituent.

Any embodiment of the present inventive polypeptide
can be incorporated into a composition, which further
comprises a carrier. Any suitable embodiment of the
present inventive polypeptide can be encoded by a nucleic
acid that can be expressed in a cell. In this regard, the
present invention further provides a vector comprising such
a nucleic acid. The nucleic acids and vectors also can be
incorporated into a composition comprising a carrier.

Additionally, the present invention provides a method
of making an antibody to a polypeptide of the present
invention. The present invention also provides a method of
prophylactically or therapeutically treating an HIV
infection in a mammal.

Additionally, the present invention provides an
anti-idiotype antibody comprising an internal image of a
portion of gp120, as well as a method of selecting such an
antibody.

The present invention also provides a method of making
an antibody to a portion of the gp120 protein that binds
with a portion of CCR5, CXCR4, STRL33, or CD4, as well as
the immunizing compound used to make the antibody, and the
antibody itself. In another embodiment of the present
invention, a method of removing HIV-1 from a bodily fluid
is provided.

BRIEF DESCRIPTION OF THE DRAWINGS 
Figure 1 depicts a listing of synthetic amino acids
available (from Bachem, King of Prussia, PA) for
incorporation into polypeptides of the present invention.

DETAILED DESCRIPTION OF THE INVENTION 
The present invention provides a polypeptide that
binds with gp120 of HIV, in particular HIV-1, more
particularly HIV-lu^, under physiological conditions. The 
polypeptide has a number of uses including, but not limited
to, the use of the polypeptide to elucidate the mechanism
by which HIV, such as HIV-1, attaches to and/or infects a
particular cell, to induce an immune response in a mammal,
in particular a human, to HIV, in particular HIV-1, and to
inhibit the replication of HIV, in particular HIV-1, in an
infected mammal, in particular a human.

Multiple embodiments of the present inventive
polypeptide are provided. Each embodiment of the
polypeptide has a degree of homology to at least one of the
human CCR5, CXCR4 and STRL33 chemokine receptors, or the
human CD4 cell-surface protein. In each embodiment
provided herein, a letter indicates the standard amino acid
designated by that letter, and a letter followed directly
by an asterisk (*) preferably represents the amino acid
represented by the letter (e.g., N represents asparaginyl
and T represents threoninyl), or a synthetic or naturally
occurring conservative or neutral substitution therefor.
Additionally, in accordance with convention, all amino acid
sequences provided herein are given either from left to
right, or top to bottom, such that the first amino acid is
amino-terminal and the last is carboxyl-terminal. The
synthesis of polypeptides, either synthetically (i.e.,
chemically) or biologically, is within the skill in the
art.
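By way of illustration only, the asterisk notation described above can be read mechanically. The following Python sketch is not part of the disclosure; the function name parse_starred is hypothetical, and the sketch assumes that each residue is written as a single letter optionally followed by one asterisk.

```python
def parse_starred(seq):
    """Split a starred sequence into (residue, substitutable) pairs.

    A letter followed directly by "*" is flagged as a position where a
    conservative or neutral substitution is contemplated.
    """
    pairs = []
    for ch in seq.replace(" ", ""):
        if ch == "*":
            if not pairs:
                raise ValueError("an asterisk must follow a residue letter")
            residue, _ = pairs[-1]
            pairs[-1] = (residue, True)
        else:
            pairs.append((ch, False))
    return pairs


pairs = parse_starred("YDIN*YYT*S*E")
# Preferred sequence: every starred letter read as the residue it names.
preferred = "".join(residue for residue, _ in pairs)
# 1-based positions at which a substitution is contemplated.
variable_positions = [i + 1 for i, (_, starred) in enumerate(pairs) if starred]
```

For the sequence YDIN*YYT*S*E, the flagged positions are the same positions that are written as X in YDIXYYXXE.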

It is within the skill of the ordinary artisan to
select synthetic and naturally occurring amino acids that
make conservative or neutral substitutions for any
particular naturally occurring amino acid. The skilled
artisan desirably will consider the context in which any
particular amino acid substitution is made, in addition to
considering the hydrophobicity or polarity of the side
chain, the general size of the side chain, and the pK value
of side chains with acidic or basic character under
physiological conditions. For example, lysine, arginine,
and histidine are often suitably substituted for each
other, and more often arginine and lysine. As is known in
the art, this is because all three amino acids have basic
side chains, whereas the pK values for the side chains of
lysine and arginine are much closer to each other (about 10
and 12) than to that of histidine (about 6). Similarly,
glycine, alanine, valine, leucine, and isoleucine are often
suitably substituted for each other, with the proviso that
glycine is frequently not suitably substituted for the
other members of the group. This is because each of these
amino acids is relatively hydrophobic when incorporated
into a polypeptide, but glycine's lack of a β-carbon allows
the phi and psi angles of rotation (around the α-carbon) so
much conformational freedom that glycinyl residues can
trigger changes in conformation or secondary structure that
do not often occur when the other amino acids are
substituted for each other. Other groups of amino acids
frequently suitably substituted for each other include, but
are not limited to, the group consisting of glutamic and
aspartic acids; the group consisting of phenylalanine,
tyrosine and tryptophan; and the group consisting of
serine, threonine and, optionally, tyrosine. Additionally,
the skilled artisan can readily group synthetic amino acids
with naturally occurring amino acids.
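As a non-limiting illustration, the substitution groups named above can be written as a lookup table. The Python sketch below is an assumption-laden simplification: the names CONSERVATIVE_GROUPS and is_conservative are hypothetical, only the twenty standard one-letter codes are handled, and the context-dependent judgment described above (including the glycine proviso) is not modeled.

```python
# Groups taken from the preceding paragraph, in one-letter code.
CONSERVATIVE_GROUPS = [
    set("KRH"),    # lysine, arginine, histidine (basic side chains)
    set("GAVLI"),  # glycine, alanine, valine, leucine, isoleucine
    set("DE"),     # aspartic and glutamic acids
    set("FYW"),    # phenylalanine, tyrosine, tryptophan
    set("STY"),    # serine, threonine and, optionally, tyrosine
]


def is_conservative(a, b):
    """Return True if residues a and b differ but share a group above."""
    return a != b and any(a in g and b in g for g in CONSERVATIVE_GROUPS)


examples = {
    ("K", "R"): is_conservative("K", "R"),
    ("D", "E"): is_conservative("D", "E"),
    ("K", "D"): is_conservative("K", "D"),
}
```

A table of this kind treats every member of a group as interchangeable, which is coarser than the case-by-case consideration the skilled artisan would apply.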

In the context of the present invention, a polypeptide
is "substantially identical" to another polypeptide if it
comprises at least about 80% identical amino acids.
Desirably, at least about 50% of the non-identical amino
acids are conservative or neutral substitutions. Also,
desirably, the polypeptides differ in length (i.e., due to
deletion mutations) by no more than about 10%.
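The identity and length criteria above can be sketched as a simple check. The following Python example is illustrative only and rests on stated assumptions: the two sequences are supplied already aligned, with "-" marking deleted positions; identity is computed relative to the longer sequence; and the function name and thresholds (fixed at 80% and 10%, ignoring the word "about") are hypothetical choices. The desirable 50% conservative-substitution criterion is not checked.

```python
def substantially_identical(aligned_a, aligned_b,
                            min_identity=0.80, max_len_diff=0.10):
    """Apply the 80% identity and 10% length criteria to two aligned
    one-letter sequences of equal aligned length ("-" marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    len_a = len(aligned_a.replace("-", ""))
    len_b = len(aligned_b.replace("-", ""))
    longer = max(len_a, len_b)
    if longer == 0:
        raise ValueError("sequences must not be empty")
    # Count aligned columns where both sequences carry the same residue.
    matches = sum(1 for x, y in zip(aligned_a, aligned_b)
                  if x == y and x != "-")
    identity = matches / longer
    len_diff = abs(len_a - len_b) / longer
    return identity >= min_identity and len_diff <= max_len_diff


one_change = substantially_identical("YDINYYTSE", "YDIAYYTSE")     # 8 of 9
three_changes = substantially_identical("YDINYYTSE", "YDIAYYAAE")  # 6 of 9
```

Under these assumptions a nonapeptide with one substitution passes and one with three substitutions does not.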

In a first embodiment, the present invention provides
a polypeptide comprising the amino acid sequence YDIXYYXXE
(SEQ ID NO: 1), wherein X is any synthetic or naturally
occurring amino acid residue, and the polypeptide comprises
less than about 100 contiguous amino acids, preferably less
than about 50 amino acids, more preferably less than about
25 amino acids, and yet more preferably less than about 13
amino acids that are identical to, or, in the alternative,
substantially identical to, the amino acid sequence of the
human CCR5 chemokine receptor.
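For illustration, the motif of SEQ ID NO: 1 and the contiguous-identity limit can be examined with standard string tools. The Python sketch below is hypothetical in its function names and makes two assumptions: every residue, including any synthetic residue, is written as a single character, and the reference string is only the 18-residue sequence of SEQ ID NO: 5 given herein, not the full-length receptor. The candidate sequence AAYDINYYTSEKK is invented for the example.

```python
import re
from difflib import SequenceMatcher

# YDIXYYXXE written as a regular expression; "." stands in for X.
SEQ_ID_NO_1 = re.compile(r"YDI.YY..E")


def find_motif(sequence):
    """Return (1-based start, matched text) of the first motif hit, or None."""
    m = SEQ_ID_NO_1.search(sequence)
    return (m.start() + 1, m.group(0)) if m else None


def longest_shared_run(candidate, reference):
    """Length of the longest contiguous stretch identical in both sequences."""
    matcher = SequenceMatcher(None, candidate, reference, autojunk=False)
    match = matcher.find_longest_match(0, len(candidate), 0, len(reference))
    return match.size


reference = "MDYQVSSPIYDINYYTSE"  # SEQ ID NO: 5 as recited herein
hit = find_motif(reference)
run = longest_shared_run("AAYDINYYTSEKK", reference)  # made-up candidate
```

A run length computed this way covers only exact identity; the "substantially identical" alternative would need a separate comparison.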

Preferably, the polypeptide of the first embodiment
comprises YDIXYYXXE (SEQ ID NO: 1), wherein the amino
moiety of the amino-terminal tyrosinyl residue is not bound
to another amino acid residue via a peptidic bond, and the
carboxyl moiety of the glutamyl residue is not bound to
another amino acid residue via a peptidic bond. However,
the polypeptide can consist essentially of YDIXYYXXE (SEQ
ID NO: 1) and, optionally, can be modified by one or more
pharmaceutically acceptable substituents, such as, for
example, t-boc or a saccharide.

More particularly, the polypeptide comprises the amino
acid sequence YDIN*YYT*S*E (SEQ ID NO: 3). Preferably, N*
is asparaginyl, T* is threoninyl, and S* is serinyl.

The polypeptide of the first embodiment can comprise a
dodecapeptide selected from the amino acid sequence
M*D*YQ*V*S*SP*IYDIN*YYT*S*E (SEQ ID NO: 5). More
preferably, the polypeptide of the first embodiment
comprises the amino acid sequence MDYQVSSPIYDINYYTSE (SEQ
ID NO: 5).

In a second embodiment, the present invention provides
a polypeptide comprising the amino acid sequence
XEXIXIYXXXNYXXX (SEQ ID NO: 6), wherein X is any synthetic
or naturally occurring amino acid, and the polypeptide
comprises less than about 100 contiguous amino acids,
preferably less than about 50 amino acids, and more
preferably less than about 25 amino acids, that are
identical to or substantially identical to the amino acid
sequence of the human CXCR4 chemokine receptor.
Optionally, the polypeptide consists essentially of, or
consists of, the sequence EXIXIYXXXNY (SEQ ID NO: 7).

In a preferred polypeptide of this second embodiment,
the polypeptide comprises the amino acid sequence
M*EG*IS*IYT*S*D*NYT*E*E*. Preferably,
M*EG*IS*IYT*S*D*NYT*E*E* is M*EGISIYTSDNYT*E*E*.

In a third embodiment, the present invention provides
a polypeptide comprising the amino acid sequence EHQAFLQFS,
wherein the polypeptide comprises less than about 100
contiguous amino acid residues, preferably less than about
50 contiguous amino acid residues, more preferably less
than about 25 contiguous amino acid residues, that are
identical to or substantially identical to the amino acid
sequence of the human STRL33 chemokine receptor. The
polypeptide can consist essentially of, or consist of, the
sequence EHQAFLQFS.

The first three embodiments of the present invention
provide, among other things, polypeptides having
substantial identity or identity to the amino-terminal
regions of the chemokine receptors CCR5, CXCR4, and STRL33.
These first three embodiments form a first group of
embodiments of the present invention. The present
invention also provides, in a second group of embodiments,
polypeptides having substantial identity or identity to an
internal region of the human chemokine receptors CCR5,
CXCR4, and STRL33, as well as to the leukocyte cell-surface
protein CD4.

This second group of embodiments provides a
polypeptide that binds with HIV gp120 under physiological
conditions and comprises at least a portion of or all of an
amino acid sequence selected from the group consisting of
LPPLYSLVFIFGFVGNML (SEQ ID NO: 11), QWDFGNTMCQLLTGLYFIGFFS
(SEQ ID NO: 12), SQYQFWKNFQTLKIVILG (SEQ ID NO: 13),
APYNIVLLLNTFQEFFGLNNCS (SEQ ID NO: 14), and
YAFVGEKFRNYLLVFFQK (SEQ ID NO: 15), wherein the polypeptide
comprises less than about 100 amino acids that are
identical to or substantially identical to the amino acid
sequence of the human CCR5 chemokine receptor; or selected
from the group consisting of LLLTIPDFIFANVSEADD (SEQ ID NO:
16) (165-182), WFQFQHIMVGLILPGIV (SEQ ID NO: 17) (197-214),
and IDSFILLEIIKQGCEFEN (SEQ ID NO: 18) (261-278),
wherein the polypeptide comprises less than about 100 amino
acids that are identical to or substantially identical to
the amino acid sequence of the human CXCR4 chemokine
receptor; or selected from the group consisting of
LVISIFYHKLQSLTDVFL (SEQ ID NO: 19) (53-70),
PFWAYAGIHEWVFGQVMC (SEQ ID NO: 20) (85-102),
EAISTWLATQMTLGFFL (SEQ ID NO: 21) (185-202),
LTMIVCYSVIIKTLLHAG (SEQ ID NO: 22) (205-222),
MAVFLLTQMPFNLMKFIRSTHW (SEQ ID NO: 23) (237-258),
HWEYYAMTSFHYTIMVTE (SEQ ID NO: 24) (257-274),
ACLNPVLYAFVSLKFRKN (SEQ ID NO: 25) (281-298) and
SKTFSASHNVEATSMFQL (SEQ ID NO: 26) (325-342), wherein the
polypeptide comprises less than about 100 amino acids that
are identical to or substantially identical to the amino
acid sequence of the human STRL33 chemokine receptor; or
selected from the group consisting of DTYICEVED (SEQ
ID NO: 27), EEVQLLVFGLTANSD (SEQ ID NO: 28),
THLLQGQSLTLTLES (SEQ ID NO: 29), and GEQVEFSFPLAFTVE (SEQ
ID NO: 30), wherein the polypeptide binds with HIV gp120
under physiological conditions and comprises less than
about 100 amino acids that are identical to or
substantially identical to the amino acid sequence of the
human CD4 cell-surface protein. Optionally, the recited
amino acid sequences can comprise 1 to about 6 conservative
or neutral amino acid substitutions.

The polypeptides of this second group of embodiments
preferably comprise less than about 50 amino acid residues,
and more preferably less than about 25 amino acid residues,
and yet more preferably no additional amino acid residues,
that are identical to a protein that naturally has the
recited amino acid sequence. The polypeptide can be
alternatively characterized by an absence of a region,
outside the above-recited amino acid sequences, that has
about five, or about ten, contiguous amino acid residues
whose sequence consists of residues that are identical to,
or conservatively substituted relative to, an amino acid
sequence of the protein to which the polypeptide of the
compound has homology.

Any embodiment of the present inventive polypeptide
can also comprise a pharmaceutically acceptable
substituent, attachment of which is within the skill in the
art. The pharmaceutical acceptability of substituents
is understood by those skilled in the art. For example, a
pharmaceutically acceptable substituent can be a
biopolymer, such as a polypeptide, an RNA, a DNA, or a
polysaccharide. Suitable polypeptides comprise fusion
proteins, an antibody or fragment thereof, a cell adhesion
molecule or a fragment thereof, or a peptide hormone.
Suitable polysaccharides comprise polyglucose moieties,
such as starch and their derivatives, such as heparin. The
pharmaceutically acceptable substituent also can be any
suitable lipid or lipid-containing moiety, such as a lipid
of a liposome or a vesicle, or even a lipophilic moiety,
such as a prostaglandin, a steroid hormone, or a derivative
thereof. Additionally, the pharmaceutically acceptable
substituent can be a nucleotide or nucleoside, such as
nicotinamide adenine dinucleotide or thymine, an amino acid
residue, a saccharide or disaccharide, or the residue of
another biomolecule naturally occurring in a cell, such as
inositol or a vitamin, such as vitamin C, thiamine, or
nicotinic acid. Synthetic organic moieties also can be
pharmaceutically acceptable substituents, such as t-butyl
carbonyl, an acetyl moiety, quinine, polystyrene and other
biologically acceptable polymers. Optionally, a
pharmaceutically acceptable substituent can be selected
from the group consisting of a C1-C18 alkyl, a C2-C18
alkenyl, a C2-C18 alkynyl, a C6-C18 aryl, a C7-C18 alkaryl, a
C7-C18 aralkyl, and a C3-C18 cycloalkyl, wherein any of the
foregoing moieties that are cyclic comprise from 0 to 2
atoms per carbocyclic ring, which can be the same or
different, and are selected from the group consisting of
nitrogen, oxygen, and sulfur.

Any of the substituents from this group can be
substituted by one to six substituent moieties, which can
be the same or different, selected from the group
consisting of an amino moiety, a carbamate moiety, a
carbonate moiety, hydroxyl, a phosphamate moiety, a
phosphate moiety, a phosphonate moiety, a pyrophosphate
moiety, a triphosphate moiety, a sulfamate moiety, a
sulfate moiety, a sulfonate moiety, a C1-C8 monoalkylamine
moiety, a C1-C8 dialkylamine moiety, and a C1-C8
trialkylamine moiety.

Any embodiment of the present inventive polypeptide
can be encoded by a nucleic acid and can be expressed in a
cell. The skilled artisan will recognize that the encoded
polypeptide as well as any pharmaceutically acceptable
substituent to be incorporated into the polypeptide, e.g.,
a formyl or acetyl substituent on an amino-terminal
methionine or a saccharide, will preferably be produced by
a cell that can express the polypeptide of the present
invention. Accordingly, the amino acids incorporated into
the polypeptide encoded by the nucleic acid are preferably
naturally occurring.
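As a purely illustrative sketch of how a nucleic acid sequence encoding such a polypeptide could be written down, the following Python example back-translates a one-letter amino acid sequence using the standard genetic code. The choice of a single codon per residue is an arbitrary assumption made for the example, not a codon-usage recommendation, and the names CODON and back_translate are hypothetical. Only the twenty naturally occurring amino acids are handled.

```python
# One standard-genetic-code codon per amino acid (arbitrary choice among
# synonymous codons; a real design would consider host codon usage).
CODON = {
    "A": "GCT", "C": "TGT", "D": "GAT", "E": "GAA", "F": "TTT",
    "G": "GGT", "H": "CAT", "I": "ATT", "K": "AAA", "L": "CTG",
    "M": "ATG", "N": "AAT", "P": "CCT", "Q": "CAA", "R": "CGT",
    "S": "TCT", "T": "ACT", "V": "GTT", "W": "TGG", "Y": "TAT",
}


def back_translate(peptide):
    """Return one DNA coding sequence (sense strand, 5' to 3') for peptide."""
    try:
        return "".join(CODON[residue] for residue in peptide)
    except KeyError as exc:
        raise ValueError(f"no codon for residue {exc.args[0]!r}") from None


coding = back_translate("YDINYYTSE")
```

Because the code is degenerate, many other nucleic acid sequences encode the same polypeptide; this sketch returns only one of them and adds no start codon, stop codon, or regulatory sequence.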

A nucleic acid as described above can be cloned into
any suitable vector and can be used to transduce,
transform, or transfect any suitable host. The selection
of vectors and methods to construct them are commonly known
to persons of ordinary skill in the art and are described
in general technical references (see, in general,
"Recombinant DNA Part D," Methods in Enzymology, Vol. 153,
Wu and Grossman, eds., Academic Press (1987)). Desirably,
the vector comprises regulatory sequences, such as
transcription and translation initiation and termination
codons, which are specific to the type of host (e.g.,
bacterium, fungus, plant, or animal) into which the vector
is to be inserted, as appropriate and taking into
consideration whether the vector is DNA or RNA.
Preferably, the vector comprises regulatory sequences that
are specific to the genus of the host. Most preferably,
the vector comprises regulatory sequences that are specific
to the species of the host and is optimized for the
expression of an above-described polypeptide.

Constructs of vectors, which are circular or linear,
can be prepared to contain an entire nucleic acid sequence
as described above or a portion thereof ligated to a
replication system that is functional in a prokaryotic or
eukaryotic host cell. Replication systems can be derived
from ColE1, 2μ plasmid, λ, SV40, bovine papilloma virus,
and the like.

Suitable vectors include those designed for
propagation and expansion, or for expression, or both. A
preferred cloning vector is selected from the group
consisting of the pUC series, the pBluescript series
(Stratagene, La Jolla, CA), the pET series (Novagen,
Madison, WI), the pGEX series (Pharmacia Biotech, Uppsala,
Sweden), and the pEX series (Clontech, Palo Alto, CA).
Examples of animal expression vectors include pEUK-C1, pMAM
and pMAMneo (Clontech, Palo Alto, CA).

An expression vector can comprise a native or
nonnative promoter operably linked to a nucleic acid
molecule encoding an above-described polypeptide. The
selection of promoters, e.g., strong, weak, inducible,
tissue-specific and developmental-specific, is within the
skill in the art. Similarly, the combining of a nucleic
acid molecule as described above with a promoter is also
within the skill in the art.

The skilled artisan will also recognize that the
polypeptide has the ability to bind the gp120 protein,
which is most often found outside of cells. Accordingly,
the present inventive nucleic acid advantageously can
comprise a nucleic acid sequence that encodes a signal
sequence such that the signal sequence is translated as a
fusion protein with the present inventive polypeptide
to form a signal sequence-polypeptide fusion. The signal
sequence can cause secretion of the entire polypeptide,
including the signal sequence (which is a pharmaceutically
acceptable substituent), or can be cleaved from the
polypeptide (i.e., the polypeptide of the compound) prior
to, or during, secretion so that at least the present
inventive polypeptide is secreted out of a cell in which
the nucleic acid is expressed.
Alternatively, the nucleic acid comprises or encodes
an antisense nucleic acid molecule or a ribozyme that is
specific for a specified amino acid sequence of an
above-described polypeptide. A nucleic acid sequence
introduced in antisense suppression generally is
substantially identical to at least a portion of the
endogenous gene or genes to be repressed, but need not be
identical. Thus, the vectors can be designed such that the
inhibitory effect applies to other proteins within a family
of genes exhibiting homology or substantial homology to the
target gene. The introduced sequence also need not be
full-length relative to either the primary transcription
product or the fully processed mRNA. Generally, higher
homology can be used to compensate for the use of a shorter
sequence. Furthermore, the introduced sequence need not
have the same intron or exon pattern, and homology of
non-coding segments will be equally effective.
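For illustration only, an antisense sequence that is fully complementary to a given sense-strand segment is its reverse complement. The Python sketch below shows that single operation; the function name reverse_complement and the six-nucleotide input are hypothetical, and the sketch does not model the partial identity or shortened lengths that the preceding paragraph permits.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string written 5' to 3'."""
    dna = dna.upper()
    if set(dna) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G, and T")
    # Complement each base, then reverse so the result also reads 5' to 3'.
    return dna.translate(_COMPLEMENT)[::-1]


antisense = reverse_complement("ATGGAT")
```

Applying the function twice returns the original sequence, which is a convenient sanity check on any designed segment.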

Ribozymes also have been reported to have use as a
means to inhibit expression of endogenous genes. It is
possible to design ribozymes that specifically pair with
virtually any target RNA and cleave the phosphodiester
backbone at a specific location, thereby functionally
inactivating the target RNA. In carrying out this
cleavage, the ribozyme is not itself altered and is, thus,
capable of recycling and cleaving other molecules, making
it a true enzyme. The inclusion of ribozyme sequences
within antisense RNAs confers RNA-cleaving activity upon
them, thereby increasing the activity of the constructs.
The design and use of target RNA-specific ribozymes is
described in Haseloff et al., Nature 334: 585-591 (1988).

Further provided by the present invention is a
composition comprising an above-described polypeptide or
nucleic acid and a carrier therefor. Another composition
provided by the present invention is a composition
comprising an antibody to an above-described polypeptide or
an anti-antibody to an above-described polypeptide.

Any embodiment of the present invention, including the
present inventive polypeptide, nucleic acid, antibody, and
anti-antibody, can be incorporated into a composition
comprising a carrier. The carrier can serve any function.
For example, the carrier can increase the solubility of the
present inventive polypeptide, nucleic acid or antibody in
aqueous solutions. Additionally, the carrier can protect
the present inventive polypeptide, nucleic acid or antibody
from environmental insults, such as dehydration, oxidation,
and photolysis. Moreover, the carrier can serve as an
adjuvant, or as a timed-release control means in a
biological system.

Antibodies can be generated in accordance with methods
known in the art. See, for example, Benjamin, In
Immunology: a short course, Wiley-Liss, NY, 1996, pp.
436-437; Kuby, In Immunology, 3rd. ed., Freeman, NY, 1997,
pp. 455-456; Greenspan et al., FASEB J. 7: 437-443 (1993);
and Poskitt, Vaccine 9: 792-796 (1991). Anti-antibodies
(i.e., anti-idiotypic antibodies) also can be generated in
accordance with methods known in the art (see, for example,
Benjamin, In Immunology: a short course, Wiley-Liss, NY,
1996, pp. 436-437; Kuby, In Immunology, 3rd. ed., Freeman,
NY, 1997, pp. 455-456; Greenspan et al., FASEB J. 7:
437-443 (1993); Poskitt, Vaccine 9: 792-796 (1991); and
Madiyalakan et al., Hybridoma 14: 199-203 (1995)
("Anti-idiotype induction therapy")). Such antibodies can
be obtained and employed either in solution-phase or
coupled to a desired solid-phase matrix. Having in hand
such antibodies, one skilled in the art will further
appreciate that such antibodies, using well-established
procedures (e.g., such as described by Harlow and Lane
(1988, supra)), are useful in the detection,
quantification, or purification of gp120 or HIV,
particularly HIV-1, conjugates of each, and host cells
transformed to produce a gp120 receptor or a derivative
thereof. Such antibodies are also useful in a method of
prevention or treatment of a viral infection and in a
method of inducing an immune response to HIV as provided
herein.

In view of the above, an above-described polypeptide
can be administered to an animal. The animal generates
anti-polypeptide antibodies. Among the anti-polypeptide
antibodies generated or induced in the animal are
antibodies that have an internal image of gp120. In
accordance with well-known methods, polyclonal or
monoclonal antibodies can be obtained, isolated and
selected. Selection of an anti-polypeptide antibody that
has an internal image of gp120 can be based upon
competition between the anti-polypeptide antibody and gp120
for binding to an above-described polypeptide, or upon the
ability of the anti-polypeptide antibody to bind to a free
polypeptide as opposed to a polypeptide bound to gp120.
Such an anti-antibody can be administered to an animal to
prevent or treat an HIV infection in accordance with
methods provided herein.

Although nonhuman anti-idiotypic antibodies, such as
an anti-polypeptide antibody that has an internal image of
gp120 and, therefore, is anti-idiotypic to gp120, are
useful for prophylaxis in humans, their favorable
properties might, in certain instances, be further
enhanced and/or their adverse properties further
diminished, through "humanization" strategies, such as
those recently reviewed by Vaughan, Nature Biotech., 16,
535-539, 1998.

Prior to administration to an animal, such as a
mammal, in particular a human, an above-described
polypeptide, nucleic acid, antibody or anti-antibody can be
formulated into various compositions by combination with
appropriate carriers, in particular, pharmaceutically
acceptable carriers or diluents, and can be formulated to
be appropriate for either human or veterinary applications.

The present invention also provides a method of making
an antibody. The method comprises administering an
immunogenic amount of an above-described polypeptide or
nucleic acid to an animal, such as a mammal, in particular
a human. The quantity of a polypeptide or nucleic acid
that is immunogenic will depend in part on the
degree of similarity to a protein or other molecule of the
inoculated animal, the route of administration of the
polypeptide or nucleic acid, and the size of the
polypeptide administered or encoded by the administered
nucleic acid. If necessary, the polypeptide or nucleic
acid can be mixed with or ligated to a substance (or an
adjuvant) that enhances its immunogenicity. Such
calculations and procedures are within the skill of the
ordinary artisan. Additionally, the present inventive
method preferably can be used to induce an immune response
against HIV, particularly HIV-1, in a mammal, particularly
a human.

In view of the above, the present invention further
provides a method of prophylactically or therapeutically
treating an HIV infection in a mammal, particularly a
human, in need thereof. The method comprises administering
to the mammal an HIV replication-inhibiting effective
amount of an above-described polypeptide, nucleic acid, or
an anti-antibody to an above-described polypeptide or a
nucleic acid encoding such a polypeptide.

The present invention also provides a method of
prophylactically or therapeutically treating HIV infection
in a mammal. The method comprises administering to the
mammal an effective amount of an above-described
polypeptide or nucleic acid. Prior to administration to an
animal, such as a mammal, in particular a human, an
above-described polypeptide or nucleic acid can be formulated
into various compositions by combination with appropriate
carriers, in particular, pharmaceutically acceptable
carriers or diluents, and can be formulated to be
appropriate for either human or veterinary applications.

Thus, a composition for use in the method of the
present invention can comprise one or more of the
polypeptides, nucleic acids, antibodies or anti-antibodies
described herein, preferably in combination with a
pharmaceutically acceptable carrier. Pharmaceutically
acceptable carriers are well-known to those skilled in the
art, as are suitable methods of administration. The choice
of carrier will be determined, in part, by whether a
polypeptide or a nucleic acid is to be administered, as
well as by the particular method used to administer the
composition. Optionally, the carrier can be selected to
increase the solubility of the composition or mixture,
e.g., a liposome or polysaccharide. One skilled in the art
will also appreciate that various routes of administering a
composition are available, and, although more than one
route can be used for administration, a particular route
can provide a more immediate and more effective reaction
than another route. Accordingly, there are a wide variety
of suitable formulations of compositions that can be used
in the present inventive methods.

A composition in accordance with the present
invention, alone or in further combination with one or more
other active agents, can be made into a formulation
suitable for parenteral administration, preferably
intraperitoneal administration. Such a formulation can
include aqueous and nonaqueous, isotonic sterile injection
solutions, which can contain antioxidants, buffers,
bacteriostats, and solutes that render the formulation
isotonic with the blood of the intended recipient, and
aqueous and nonaqueous sterile suspensions that can include
suspending agents, solubilizers, thickening agents,
stabilizers, and preservatives. The formulations can be
presented in unit dose or multi-dose sealed containers,
such as ampules and vials, and can be stored in a freeze-
dried (lyophilized) condition requiring only the addition
of the sterile liquid carrier, for example, water, for
injections, immediately prior to use. Extemporaneously
injectable solutions and suspensions can be prepared from
sterile powders, granules, and tablets, as described
herein.

A formulation suitable for oral administration can
consist of liquid solutions, such as an effective amount of
the compound dissolved in diluents, such as water, saline,
or fruit juice; capsules, sachets or tablets, each
containing a predetermined amount of the active ingredient,
as solids or granules; solutions or suspensions in an
aqueous liquid; and oil-in-water emulsions or water-in-oil
emulsions. Tablet forms can include one or more of
lactose, mannitol, corn starch, potato starch,
microcrystalline cellulose, acacia, gelatin, colloidal
silicon dioxide, croscarmellose sodium, talc, magnesium
stearate, stearic acid, and other excipients, colorants,
diluents, buffering agents, moistening agents,
preservatives, flavoring agents, and pharmacologically
compatible carriers.

Similarly, a formulation suitable for oral
administration can include lozenge forms, which can
comprise the active ingredient in a flavor, usually sucrose
and acacia or tragacanth; pastilles comprising the active
ingredient in an inert base, such as gelatin and glycerin,
or sucrose and acacia; and mouthwashes comprising the
active ingredient in a suitable liquid carrier; as well as
creams, emulsions, gels, and the like containing, in
addition to the active ingredient, such carriers as are
known in the art.

An aerosol formulation suitable for administration via
inhalation also can be made. The aerosol formulation can
be placed into a pressurized acceptable propellant, such as
dichlorodifluoromethane, propane, nitrogen, and the like.

A formulation suitable for topical application can be
in the form of creams, ointments, or lotions.

A formulation for rectal administration can be
presented as a suppository with a suitable base comprising,
for example, cocoa butter or a salicylate. A formulation
suitable for vaginal administration can be presented as a
pessary, tampon, cream, gel, paste, foam, or spray formula
containing, in addition to the active ingredient, such
carriers as are known in the art to be appropriate.

Important general considerations for design of
delivery systems and compositions, and for routes of
administration, for polypeptide drugs also apply (Eppstein,
CRC Crit. Rev. Therapeutic Drug Carrier Systems 5, 99-139,
1988; Siddiqui et al., CRC Crit. Rev. Therapeutic Drug
Carrier Systems 3, 195-208, 1987; Banga et al., Int. J.
Pharmaceutics 48, 15-50, 1988; Sanders, Eur. J. Drug Metab.
Pharmacokinetics 15, 95-102, 1990; Verhoef, Eur. J. Drug
Metab. Pharmacokinetics 15, 83-93, 1990). The appropriate
delivery system for a given polypeptide will depend upon
its particular nature, the particular clinical application,
and the site of drug action. As with any protein drug,
oral delivery will likely present special problems, due
primarily to instability in the gastrointestinal tract and
poor absorption and bioavailability of intact, bioactive
drug therefrom. Therefore, especially in the case of oral
delivery, but also possibly in conjunction with other
routes of delivery, it will be necessary to use an
absorption-enhancing agent in combination with a given
polypeptide. A wide variety of absorption-enhancing agents
have been investigated and/or applied in combination with
protein drugs for oral delivery and for delivery by other
routes (Verhoef, 1990, supra; van Hoogdalem, Pharmac. Ther.
44, 407-43, 1989; Davis, J. Pharm. Pharmacol. 44(Suppl. 1),
186-90, 1992). Most commonly, typical enhancers fall into
the general categories of (a) chelators, such as EDTA,
salicylates, and N-acyl derivatives of collagen, (b)
surfactants, such as lauryl sulfate and polyoxyethylene-9-
lauryl ether, (c) bile salts, such as glycocholate and
taurocholate, and derivatives, such as
taurodihydrofusidate, (d) fatty acids, such as oleic acid
and capric acid, and their derivatives, such as
acylcarnitines, monoglycerides, and diglycerides, (e)
non-surfactants, such as unsaturated cyclic ureas, (f)
saponins, (g) cyclodextrins, and (h) phospholipids.

Other approaches to enhancing oral delivery of protein
drugs can include the aforementioned chemical modifications
to enhance stability to gastrointestinal enzymes and/or
to increase lipophilicity. Alternatively, the protein drug
can be administered in combination with other drugs or
substances that directly inhibit proteases and/or other
potential sources of enzymatic degradation of proteins.
Yet another alternative approach to prevent or delay
gastrointestinal absorption of protein drugs is to
incorporate them into a delivery system that is designed to
protect the protein from contact with the proteolytic
enzymes in the intestinal lumen and to release the intact
protein only upon reaching an area favorable for its
absorption. A more specific example of this strategy is
the use of biodegradable microcapsules or microspheres,
both to protect vulnerable drugs from degradation and
to effect a prolonged release of active drug (Deasy, in
Microencapsulation and Related Processes, Swarbrick, ed.,
Marcel Dekker, Inc.: New York, 1984, pp. 1-60, 88-89, 208-
11). Microcapsules also can provide a useful way to effect
a prolonged delivery of a protein drug after injection
(Maulding, J. Controlled Release 6, 167-76, 1987).

The dose administered to an animal, such as a mammal,
particularly a human, in the context of the present
invention should be sufficient to effect a therapeutic or
prophylactic response in the individual over a reasonable
time frame. The dose will be determined by the particular
polypeptide, nucleic acid, antibody, or anti-antibody
administered, the severity of any existing disease state,
as well as the body weight and age of the individual. The
size of the dose also will be determined by the existence
of any adverse side effects that may accompany the use of
the particular polypeptide, nucleic acid, antibody or
anti-antibody employed. It is always desirable, whenever
possible, to keep adverse side effects to a minimum.

The dosage can be in unit dosage form, such as a
tablet or capsule. The term "unit dosage form" as used
herein refers to physically discrete units suitable as
unitary dosages for human and animal subjects, each unit
containing a predetermined quantity of a vector, alone or
in combination with other active agents, calculated in an
amount sufficient to produce the desired effect in
association with a pharmaceutically acceptable diluent,
carrier, or vehicle. The specifications for the unit
dosage forms of the present invention depend on the
particular embodiment employed and the effect to be
achieved, as well as the pharmacodynamics associated with
each polypeptide, nucleic acid or anti-antibody in the
host. The dose administered should be an "HIV infection
inhibiting amount" of an above-described polypeptide or
nucleic acid or an "immune response-inducing effective
amount" of an above-described polypeptide, an
above-described nucleic acid, or an antibody as appropriate.

Another composition provided by the present invention
is a composition comprising a solid support matrix to which
is attached an above-described polypeptide, or an
anti-antibody to an above-described polypeptide. The solid
matrix can comprise other functional reagents including,
for example, polyethylene glycol, dextran, albumin and the
like, whose intended effector functions may include one or
more of the following: to improve stability of the
conjugate; to increase the half-life of the conjugate; to
increase resistance of the conjugate to proteolysis; to
decrease the immunogenicity of the conjugate; to provide a
means to attach or immobilize a functional polypeptide or
anti-antibody onto a solid support matrix (see, for
example, Harris, in Poly(Ethylene Glycol) Chemistry:
Biotechnical and Biomedical Applications, Harris, ed.,
Plenum Press: New York (1992), pp. 1-14). Conjugates
furthermore may comprise a polypeptide or anti-antibody
coupled to an effector molecule, each of which, optionally,
may have different functions (e.g., a toxin
molecule (or an immunological reagent) and a polyethylene
glycol (or dextran or albumin) molecule). Diverse
applications and uses of functional proteins and
polypeptides, attached to or immobilized on a solid support
matrix, are exemplified more specifically for poly(ethylene
glycol) conjugated proteins or peptides in a review by
Holmberg et al. (In Poly(Ethylene Glycol) Chemistry:
Biotechnical and Biomedical Applications, Harris, ed.,
Plenum Press: New York, 1992, pp. 303-324).

In addition, the present invention provides a method
of removing HIV from a bodily fluid of an animal. The
method comprises extracorporeally contacting the bodily
fluid of the animal with a solid-support matrix to which is
attached an above-described polypeptide or an anti-antibody
to an above-described polypeptide. Alternatively, the
bodily fluid can be contacted with the polypeptide or
anti-antibody in solution and then the solution can be contacted
with a solid support matrix to which is attached a means to
remove the polypeptide or anti-antibody to which is bound
HIV gp120 from the bodily fluid.

Methods of attaching a herein-described polypeptide,
or an anti-antibody to a solid support matrix are known in
the art. "Attached" is used herein to refer to attachment
to (or coupling to) and immobilization in or on a solid
support matrix. See, for example, Harris, in Poly(Ethylene
Glycol) Chemistry: Biotechnical and Biomedical
Applications, Harris, ed., Plenum Press: New York (1992),
pp. 1-14, and international patent application WO 91/02714
(Saxinger). Diverse applications and uses of functional
polypeptides attached to or immobilized on a solid support
matrix are exemplified more specifically for poly(ethylene
glycol) conjugated proteins or peptides in a review by
Holmberg et al. (In Poly(Ethylene Glycol) Chemistry:
Biotechnical and Biomedical Applications, Harris, ed.,
Plenum Press: New York, 1992, pp. 303-324).

The present invention also provides a method of making
an antibody that binds to gp120 of HIV under physiological
conditions. The method comprises labeling an embodiment of
the present inventive compound to obtain a labeled
compound. Labeling compounds is within the skill of the
ordinary artisan. For example, the present inventive
compound can be labeled with a radioactive atom, such as
125I, in the same or a similar manner as was performed in the
examples provided below. Alternatively, an enzyme, such as
horseradish peroxidase, can be attached to or incorporated
into the present inventive compound. Then, by exposing a
chromogenic or photogenic compound to the enzyme-bearing
compound, a signal indicative of the presence and quantity
of the compound present can be generated. In another alternative,
a polyhistidinyl moiety can be attached to, or incorporated
into, the present inventive compound so that the present
inventive compound will react with high affinity to
transition metal ions such as nickel, copper, or zinc ions;
this reaction can be used as the basis to quantify the
amount of the present inventive compound present at a
particular location. In yet another alternative, the
present inventive compound can be used as antigen to a
standard antibody that specifically recognizes an antigenic
epitope of the present inventive compound. As is well-
known, the standard antibody can itself be labeled or used
in conjunction with an additional antibody that is labeled
with an enzyme, radioisotope, or other suitable means. The
skilled artisan will recognize that there is a plethora of
other suitable means and methods to label the present
inventive compound.

This present inventive method of making an antibody
that binds to a gp120 envelope protein of HIV further
comprises providing a library of synthetic peptides. The
library consists of a multiplicity of synthetically
produced polypeptides that are homologous, and preferably
essentially identical (i.e., having the same primary amino
acid residue sequence, ignoring blocking groups,
phosphorylation of serinyl, threoninyl, and tyrosinyl
residues, hydroxylation of prolinyl residues, and the like)
or identical, to a continuous region of an HIV gp120
envelope protein. The polypeptides of the library can be
any suitable length. While larger regions allow faster
scanning and tend to preserve non-linear epitopes, shorter
length polypeptides allow more sensitive screening of the
primary sequence of the gp120 protein. However,
polypeptides that are too short can lose essential
secondary structure or cleave reactive sites into one or
more pieces. Preferably, a mixture of short and long
polypeptides is incorporated into the library; however,
the library can consist of polypeptides of a single length
(measured in amino acid residues). For the sake of
convenience the library can be split into multiple parts,
and screened by parts. Typically, the polypeptides of the
library will be between about 6 and about 45 amino acid
residues in length.

Typically, the library will comprise a series of
polypeptides each having an identical sequence to that of
gp120 but having an amino-terminus a particular number of
amino acids downstream of the amino-terminus of the prior
polypeptide (see the examples section below). The distance,
measured in amino acid residues, is referred to as the
offset. Preferably, in libraries that are characterized by
the existence of an offset, the offset is not greater than
the product of the length of the longest polypeptide measured
in amino acid residues and 1.5, preferably 1.0, and more
preferably 0.5. The library can be alternatively
characterized by the existence of an offset not greater
than 30, preferably 15, and more preferably 4.
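
By way of illustration only, the construction of a library characterized by an offset can be sketched in Python as follows. The function names tile_peptides and offset_within_limit, the default length of 18 residues, and the default offset of 4 residues are assumptions chosen to mirror the peptide series reported in Example 1; they are not part of the disclosed method. The input shown is merely the first 30 residues of the CCR5 sequence listed in that example.

```python
def tile_peptides(sequence, length=18, offset=4):
    """Split a protein sequence into overlapping peptides.

    Returns (start, end, peptide) tuples using 1-based, inclusive
    residue numbering. Each peptide starts `offset` residues
    downstream of the previous one; the last peptide may be shorter
    than `length` if the sequence runs out.
    """
    if length < 1 or offset < 1:
        raise ValueError("length and offset must be positive")
    tiles = []
    start = 0  # 0-based index of the first residue of the current peptide
    while start < len(sequence):
        peptide = sequence[start:start + length]
        tiles.append((start + 1, start + len(peptide), peptide))
        if start + length >= len(sequence):
            break  # this peptide already reaches the carboxyl terminus
        start += offset
    return tiles


def offset_within_limit(length, offset, factor=0.5):
    """Check that the offset does not exceed factor x peptide length."""
    return offset <= factor * length


# Illustrative input: residues 1-30 of CCR5 as listed in Example 1.
ccr5_fragment = "MDYQVSSPIYDINYYTSEPCQKINVKQIAA"
for start, end, peptide in tile_peptides(ccr5_fragment):
    print(f"{start}-{end}\t{peptide}")
```

With these assumed defaults the sketch yields the segments 1-18, 5-22, 9-26, and 13-30, and an offset of 4 satisfies the more preferred limit of 0.5 times a peptide length of 18.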

Each polypeptide of the library is substantially
isolated from every other polypeptide of said library and
is located in a known position. For example, each
polypeptide can be bound to a solid support that is in
a vessel or that can be placed in a vessel. The vessel
preferably enables each polypeptide to be covered in a
liquid that does not contact any other polypeptide of
the library. By way of example, each polypeptide can be
bound to a bead that is placed in a vessel (or tube) or can
be bound to the well of a multi-well assay plate.
Alternatively, an array of polypeptides can be fashioned,
for example on a microchip device (as is presently used in
some DNA sequencing devices and methods), and the entire
array can be bathed in a single solution.

Each polypeptide is then individually contacted with
the labeled compound such that a portion of the labeled
compound can bind with the polypeptide of the library. In
this way, a bound population of the labeled compound of
the present invention and an unbound population of the
labeled compound are generated. The phrase "individually
contacted" means that each polypeptide has the opportunity
to bind with the labeled compound and the quantity of
labeled compound bound by each can be determined.

The method then comprises removing substantially all
of the unbound labeled compound from the position occupied
by each polypeptide. That is, the solution comprising the
labeled compound is separated from the polypeptides of the
library and the bound population of the labeled compound.
This can be done by any suitable method, e.g., by
aspiration and one or more washing steps comprising adding
a quantity of liquid sufficient to cover all the surfaces
that were contacted by the labeled compound and aspirating
away substantially all of the wash liquid.

The amount of labeled compound that remains
co-localized with each polypeptide of the library is then
measured to determine the quantity of labeled compound
bound by each polypeptide. The amount of the present
inventive compound bound by each polypeptide can be
directly evaluated to identify a portion of the HIV gp120
envelope protein that binds to an HIV receptor selected
from the group consisting of CCR5, CXCR4, STRL33, and CD4.
This information is then used to identify and provide an
immunizing compound. The immunizing compound comprises a
polypeptide comprising an amino acid sequence that is
homologous to, or preferably is essentially identical to,
or identical to, the portion of the HIV-1 gp120 envelope
protein that binds with CD4, CCR5, CXCR4, and/or STRL33.
The immunizing compound can be provided by processing gp120,
e.g., proteolytically digesting gp120 that has been
isolated from a preparation of HIV-1. Preferably, however,
the immunizing compound is prepared synthetically, or by
genetic engineering, or by a combination of genetic
engineering and synthetic methods. The immunizing compound
can comprise a pharmaceutically acceptable substituent, can
be encoded by a nucleic acid that can be expressed in a
cell, can be mixed with a carrier, and is an inventive
aspect of the present invention.

An immunogenic quantity of the immunizing compound is
then inserted into an animal (e.g., a human, a rodent, a
canine, a feline, or a ruminant) in a manner consistent
with the discussion of a method of raising an antibody to
the present inventive compounds that are homologous to
portions of CCR5, CXCR4, STRL33, and CD4, above. The
insertion of the immunizing compound causes the inoculated
animal to produce an antibody that binds with said portion
of the HIV gp120 envelope protein. Thus the present
invention also provides an antibody that binds to an HIV
gp120 envelope protein, as well as an antigen binding
protein comprising one or more complementarity determining
regions of the antibody (e.g., a Fab, an F(ab')2, an Fv, a
single-chain antibody, a diabody, and humanized variants of
all of the above, all of which are within the skill in the
art).

The antibody or variant thereof is preferably useful
in detecting or diagnosing the presence of HIV gp120
envelope protein, and thus HIV, in an animal. The antibody
also preferably prevents or attenuates infection of an
animal exposed to HIV, to whom an effective quantity of the
antibody or a variant thereof has been administered or
produced in response to inoculation with the immunizing
compound. The antibody preferably also is useful in
treating or preventing (i.e., inhibiting) HIV infection in
an animal to whom a suitable dose has been administered or
in which a suitable quantity of antibody has been produced.
The antibody is also useful in the study of HIV infection
of mammalian cells, the host range specificities of HIV
infection, and preferably, the mechanism by which
antibodies neutralize infectious viruses.






EXAMPLES 

The following examples further illustrate the present
invention but, of course, should not be construed as
limiting the scope of the claimed invention in any way.

Synthetic peptide arrays were constructed in 96-well
microtiter plates in accordance with the method set forth
in WO 91/02714 (Saxinger), and used to test the binding of
HIV-1 LAI envelope gp120 that had been labeled with
radioactive iodine (radiolabeling by standard methods).
After incubating the radiolabeled gp120 in a well with each
synthetic peptide, a washing step was performed to remove
unbound label, and the relative level of radioactivity
remaining in each well of the plate was evaluated to
determine the relative affinity of each peptide for the
gp120. The synthesis of the peptides and the quantity of
binding between the synthetic peptides and the gp120 were
found to be suitably reproducible, precise, and sensitive.
Initial screening of the entire primary sequence of the
chemokine and CD4 receptor molecules was performed 18 amino
acid residues at a time.
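
The evaluation of relative radioactivity described above can be sketched, under stated assumptions, as a simple background subtraction followed by ranking. The raw counts below are those reported for the empty control well and the first three CXCR4 segments in Example 2. The function names net_counts and rank_segments are hypothetical, and treating the count from the empty control well as the background is an assumption that is consistent with the background-subtracted values reported in that example.

```python
def net_counts(raw_counts, background):
    """Subtract the background (non-specific) count from each well."""
    return {segment: raw - background for segment, raw in raw_counts.items()}


def rank_segments(net):
    """Order segments from highest to lowest background-subtracted count."""
    return sorted(net, key=net.get, reverse=True)


# Raw counts per 20 minutes taken from the first rows of the Example 2 table.
raw = {"1-18": 3003, "5-22": 483, "9-26": 455}
background = 412  # count recorded for the empty (control) well

net = net_counts(raw, background)
print(net)                 # {'1-18': 2591, '5-22': 71, '9-26': 43}
print(rank_segments(net))  # ['1-18', '5-22', '9-26']
```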

The authenticity of the binding signals generated by
this technique has been repeatedly demonstrated by showing
that antibodies to CCR5 and CXCR4 are able to inhibit the
binding of radiolabeled gp120 to the polypeptides derived
from CCR5 and CXCR4 that show a high affinity for binding
with gp120. Additionally, the accuracy of the binding
assay used hereinbelow is demonstrated by Example 7.

Example 1

This example identifies segments of the CCR5
co-receptor that bind with gp120.

The first column in the table below indicates the
number of the amino acid in the wild-type CCR5 receptor.
The second column explicitly identifies the peptide
sequence. The third column indicates the radioactive
counts recorded in twenty minutes (i.e., the cpm x 20)
after the background or non-specific counts had been
subtracted. The fourth column contains an X in each row
for which the listed polypeptide bound with high affinity
to gp120. The fifth and final column contains an X in each
row wherein the listed sequence binds with substantial
affinity but is weak in comparison to other samples,
particularly adjacent samples.
SEQ SEG  PEPTIDE             Counts       Peak      Non-Peak  SEQ ID
                             per 20'      Activity  Activity  NO:
                             (Average -
                             background)

         empty (control)     7
1-18     MDYQVSSPIYDINYYTSE  735          X
5-22     VSSPIYDINYYTSEPCQK  383                    X         32
9-26     IYDINYYTSEPCQKINVK  228                    X         33
13-30    NYYTSEPCQKINVKQIAA  6                                34
17-34    SEPCQKINVKQIAARLLP  -44                              35
21-38    QKINVKQIAARLLPPLYS  20                               36
25-42    VKQIAARLLPPLYSLVFI  18                               37
29-46    AARLLPPLYSLVFIFGFV  33                               38
33-50    LPPLYSLVFIFGFVGNML  705          X                   39
37-54    YSLVFIFGFVGNMLVILI  347                    X         40
41-58    FIFGFVGNMLVILILINC  343                    X         41
45-62    FVGNMLVILILINCKRLK  62                               42
49-66    MLVILILINCKRLKSMTD  84                               43
53-70    LILINCKRLKSMTDIYLL  2                                44
57-74    NCKRLKSMTDIYLLNLAI  25                               45
61-78    LKSMTDIYLLNLAISDLF  210                              46
65-82    TDIYLLNLAISDLFFLLT  38                               47
69-86    LLNLAISDLFFLLTVPFW  144                              48
73-90    AISDLFFLLTVPFWAHYA  41                               49
77-94    LFFLLTVPFWAHYAAAQW  173                              50
81-98    LTVPFWAHYAAAQWDFGN  306                              51
85-      FWAHYAAAQWDFGNTMCQ  212                              52
89-      YAAAQWDFGNTMCQLLTG  494                    X         53
93-      QWDFGNTMCQLLTGLYFI  1019         X                   54
97-      GNTMCQLLTGLYFIGFFS  941          X                   55
101-     CQLLTGLYFIGFFSGIFF  489                    X         56
105-     TGLYFIGFFSGIFFIILL  80                               57
109-     FIGFFSGIFFIILLTIDR  76                               58
113-     FSGIFFIILLTIDRYLAV  83                               59
117-     FFIILLTIDRYLAVVHAV  77                               60
121-     LLTIDRYLAVVHAVFALK  31                               61
125-     DRYLAVVHAVFALKARTV  62                               62
129-     AVVHAVFALKARTVTFGV  34                               63
133-     AVFALKARTVTFGVVTSV  63                               64
137-     LKARTVTFGVVTSVITWV  74                               65
141-     TVTFGVVTSVITWVVAVF  -25                              66
145-     GVVTSVITWVVAVFASLP  69                               67
149-     SVITWVVAVFASLPGIIF  46                               68
153-     WVVAVFASLPGIIFTRSQ  87                               69
157-     VFASLPGIIFTRSQKEGL  54                               70
161-     LPGIIFTRSQKEGLHYTC  118                              71
165-     IFTRSQKEGLHYTCSSHF  98                               72
169-     SQKEGLHYTCSSHFPYSQ  304                    X         73
173-     GLHYTCSSHFPYSQYQFW  301                    X         74
177-     TCSSHFPYSQYQFWKNFQ  367                    X         75
181-     HFPYSQYQFWKNFQTLKI  1008                   X         76
185-     SQYQFWKNFQTLKIVILG  1572         X                   77
189-     FWKNFQTLKIVILGLVLP  40                               78
193-     FQTLKIVILGLVLPLLVM  45                               79
197-     KIVILGLVLPLLVMVICY  65                               80
201-     LGLVLPLLVMVICYSGIL  180                              81
205-     LPLLVMVICYSGILKTLL  68                               82
209-     VMVICYSGILKTLLRCRN  -8                               83
213-     CYSGILKTLLRCRNEKKR  70                               84
217-     ILKTLLRCRNEKKRHRAV  19                               85
221-     LLRCRNEKKRHRAVRLIF  102                              86
225-     RNEKKRHRAVRLIFTIMI  23                               87
229-     KRHRAVRLIFTIMIVYFL  36                               88
233-     AVRLIFTIMIVYFLFWAP  62                               89
237-     IFTIMIVYFLFWAPYNIV  121                              90
241-     MIVYFLFWAPYNIVLLLN  214                              91
245-     FLFWAPYNIVLLLNTFQE  616                    X         92
249-     APYNIVLLLNTFQEFFGL  1962         X                   93
253-     IVLLLNTFQEFFGLNNCS  2134         X                   94
257-     LNTFQEFFGLNNCSSSNR  293                    X         95
261-     QEFFGLNNCSSSNRLDQA  63                               96
265-     GLNNCSSSNRLDQAMQVT  -31                              97
269-     CSSSNRLDQAMQVTETLG  90                               98
273-     NRLDQAMQVTETLGMTHC  10                               99
277-     QAMQVTETLGMTHCCINP  81                               100
281-     VTETLGMTHCCINPIIYA  15                               101
285-     LGMTHCCINPIIYAFVGE  282                    X         102
289-     HCCINPIIYAFVGEKFRN  200                    X         103
293-     NPIIYAFVGEKFRNYLLV  162                    X         104
297-     YAFVGEKFRNYLLVFFQK  596          X                   105
301-     GEKFRNYLLVFFQKHIAK  69                               106
305-     RNYLLVFFQKHIAKRFCK  65                               107
309-     LVFFQKHIAKRFCKCCSI  76                               108
313-     QKHIAKRFCKCCSIFQQE  23                               109
317-     AKRFCKCCSIFQQEAPER  64                               110
321-     CKCCSIFQQEAPERASSV  53                               111
325-     SIFQQEAPERASSVYTRS  100                              112
329-     QEAPERASSVYTRSTGEQ  84                               113
333-     ERASSVYTRSTGEQEISV  84                               114
337-     SVYTRSTGEQEISVGL    47                               115



These data indicate that, in addition to polypeptide
sequences derived from positions 1-18 of the CCR5 receptor,
the polypeptide sequences LPPLYSLVFIFGFVGNML (SEQ ID NO:
11), QWDFGNTMCQLLTGLYFIGFFS (SEQ ID NO: 12),
SQYQFWKNFQTLKIVILG (SEQ ID NO: 13), APYNIVLLLNTFQEFFGLNNCS
(SEQ ID NO: 14), and YAFVGEKFRNYLLVFFQK (SEQ ID NO: 15)
comprise multiple subsequences, each of which is capable of
binding to HIV-1 envelope gp120.
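
The relationship between overlapping 18-residue test peptides and a longer binding region such as the 22-residue SEQ ID NO: 14 can be sketched in Python. This is an illustrative sketch only: the function name `merge_windows` and the rule of joining adjacent active windows by their overlap are assumptions made for the example, not a procedure stated in this description. The two input windows are the CCR5 peptides beginning at positions 249 and 253.

```python
def merge_windows(windows):
    """Merge overlapping (start, sequence) windows into contiguous regions.

    Assumption for this sketch: all windows are cut from one parent
    sequence, so residues in the overlapping part agree.
    """
    merged = []
    for start, seq in sorted(windows):
        if merged:
            m_start, m_seq = merged[-1]
            m_end = m_start + len(m_seq) - 1
            if start <= m_end + 1:
                # Keep only the residues that extend past the current region.
                extra = seq[m_end - start + 1:]
                merged[-1] = (m_start, m_seq + extra)
                continue
        merged.append((start, seq))
    return merged


# Two adjacent CCR5 test peptides (1-based start position, sequence).
peaks = [(249, "APYNIVLLLNTFQEFFGL"), (253, "IVLLLNTFQEFFGLNNCS")]
regions = merge_windows(peaks)
for start, seq in regions:
    print(start, start + len(seq) - 1, seq)
```

With these two inputs the merged region spans positions 249-270 and reproduces the sequence listed as SEQ ID NO: 14.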


Example 2 

This example identifies segments of the CXCR4
co-receptor that bind with gp120.

The first column in the table below indicates the
amino acid positions of each peptide in the wild-type CXCR4
receptor. The second column explicitly identifies the peptide
sequence. The third and fourth columns indicate the
radioactive counts recorded in twenty minutes (i.e., the
cpm x 20); in the fourth column the background or
non-specific counts (the empty control) have been
subtracted. The fifth column contains an X in each
row for which the listed polypeptide bound with high
affinity to gp120. The sixth and final column contains an
X in each row wherein the listed sequence binds with
substantial affinity but is weak in comparison to other
samples, particularly adjacent samples.
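
The background correction applied to the counts can be illustrated with a short Python sketch. The function and variable names are hypothetical; the three raw counts and the empty (control) count of 412 are the values reported in the first rows of the table for this example.

```python
def subtract_background(raw_counts, control):
    """Return each count with the empty-well (control) count removed."""
    return {segment: count - control for segment, count in raw_counts.items()}


# Raw 20-minute counts for the first three CXCR4 segments in the table.
raw = {"1-18": 3003, "5-22": 483, "9-26": 455}
empty_control = 412  # count recorded for the empty (control) well

net = subtract_background(raw, empty_control)
print(net)
```

The resulting values match the background-subtracted column of the table for the same three segments.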
SEQ SEG   PEPTIDE                                 Major      Minor      SEQ
                                                  Activity   Activity   ID
                                                  Peak       Peak       NO:

          empty (control)            412       0
1-18      MEGISIYTSDNYTEEMGS        3003    2591                        116
5-22      SIYTSDNYTEEMGSGDYD         483      71                        117
9-26      SDNYTEEMGSGDYDSMKE         455      43                        118
13-30     TEEMGSGDYDSMKEPCFR         453      41                        119
17-34     GSGDYDSMKEPCFREENA         384     -28                        120
21-38     YDSMKEPCFREENANFNK         465      53                        121
25-42     KEPCFREENANFNKIFLP         664     252                        122
29-46     FREENANFNKIFLPTIYS         463      51                        123
33-50     NANFNKIFLPTIYSIIFL         585     173                        124
37-54     NKIFLPTIYSIIFLTGIV         550     138                        125
41-58     LPTIYSIIFLTGIVGNGL         530     118                        126
45-62     YSIIFLTGIVGNGLVILV         535     123                        127
49-66     FLTGIVGNGLVILVMGYQ         658     246                        128
53-70     IVGNGLVILVMGYQKKLR         650     238                        129
57-74     GLVILVMGYQKKLRSMTD         569     157                        130
61-78     LVMGYQKKLRSMTDKYRL         517     105                        131
65-82     YQKKLRSMTDKYRLHLSV         511      99                        132
69-86     LRSMTDKYRLHLSVADLL         572     160                        133
73-90     TDKYRLHLSVADLLFVIT         504      92                        134
77-94     RLHLSVADLLFVITLPFW         548     136                        135
81-98     SVADLLFVITLPFWAVDA         665     253                        136
85-102    LLFVITLPFWAVDAVANW         475      63                        137
89-106    ITLPFWAVDAVANWYFGN         542     130                        138
93-110    FWAVDAVANWYFGNFLCK         478      66                        139
97-114    DAVANWYFGNFLCKAVHV         524     112                        140
101-118   NWYFGNFLCKAVHVIYTV         508      96                        141
105-122   GNFLCKAVHVIYTVNLYS         643     231                        142
109-126   CKAVHVIYTVNLYSSVLI         655     243                        143
113-130   HVIYTVNLYSSVLILAFI         530     118                        144
117-134   TVNLYSSVLILAFISLDR         654     242                        145
121-138   YSSVLILAFISLDRYLAI         569     157                        146
125-142   LILAFISLDRYLAIVHAT         519     107                        147
129-146   FISLDRYLAIVHATNSQR         503      91                        148
133-150   DRYLAIVHATNSQRPRKL         580     168                        149
137-154   AIVHATNSQRPRKLLAEK         485      73                        150
141-158   ATNSQRPRKLLAEKVVYV         490      78                        151
145-162   QRPRKLLAEKVVYVGVWI         539     127                        152
149-166   KLLAEKVVYVGVWIPALL         501      89                        153
153-170   EKVVYVGVWIPALLLTIP         559     147                        154
157-174   YVGVWIPALLLTIPDFIF         536     124                        155
161-178   WIPALLLTIPDFIFANVS         594     182                        156
165-182   LLLTIPDFIFANVSEADD        1418    1006                        157
169-186   IPDFIFANVSEADDRYIC         850     438             X          158
173-190   IFANVSEADDRYICDRFY         679     267                        159
177-194   VSEADDRYICDRFYPNDL         569     157                        160
181-198   DDRYICDRFYPNDLWVVV         537     125                        161
185-202


I CDRF YPNDLWVWFOFO 


71 ft 
/ x o 


^ Ofi 
J u o 






162 


189-206 


FYPNDLWVWFOFOHIMV 

J. X i IviyjJ TV V V V i. yilXl 1 V 


o ._. o 


4 1 

rx X VJ 




Jv 


163 


193-210 


dlwvvvfofohimvgl.il. 


ft *^4 

O Jl 


4 22 






1 64 

X D *± 


197-214 


VVFOFOHIMVGLILPGIV 


xuux 


Rft Q 

-J O _7 






1 6 S 

X O 


201-218 


FOHIMVGLTLPGIVTLSP 

1. yXX * 1 * VJXJ X 4J£ VJX V X UiJ V~ 


Rft 2 


1 70 

X / v/ 






1 66 

XDD 


205-222 


MVGLTLPGTVTLSGYPTT 

n v vjjj x i—i t vj x v x xj »-J j. x J- 


-J / -7 


1 67 

X O i 






1 67 
x o / 


209-226 


TT.PGTVTL^rYPT T T <3KT. 
x urvji v x uo^- j. x x x o xvxj 


04 

O VJ r± 


1 Q2 

X _7 ^ 






1 6ft 
X o o 


2 13 -23 0 


X V lUOU J- V— XXX urUJOnDlV 


fift Q 

O O «7 


2 77 






1 6 Q 

X D J 


217-234 


k_>V_XV_XXXk_> XVXJ o n O 1 VVJX1 y IV 


D / X 


^ J 7 






17 0 

X / VJ 


£-> £-> X £i «J O 


x x x o xvxj o n o rvvxriy r\_r\. i\rvu 




1 ^7 
x O / 






17 1 

X / X 


22 R -249 


x\xj o n o x\,vjxx\^ xvjrv ivnu xv x x v 


R4 2 


17 0 
X J u 






17 9 

X / £. 


229-246 


CJKGHOKRKAT.KTTVTT.TT. 

O XVv_7XX<^ XVXvXVJr^XJ Xv X X V X XJ X XJ 


c;2 
-j ~j __ 


14 0 
X *± u 






1 77 


27 7-9^0 


OKP K A T . KTTV T T, T T . A W W A 
yi\xvi\rixji\.i x v xuxixrir rn 


D 27 3 


2 ft 7 
^ o o 






1 74 

X / H. 


237-254 


AT.KTTVTT.TT.AFFAPWT.P 
x v x x v xuxxxrir c nv. »v xj xr 




261 

— OX 






X / _) 


241-258 


TVTT.TT.APPArWT.PYYTn 
x v x xj x xinr c rtv< » t xj xr x x x vj 


7"^ ^ 
zoo 


7 2 7 

-3 ^ J 






176 

X / D 


24*5-26? 


TTiAFFArWT.PYYTfJT^TD 
x xirir xr rt. v^ vk xj xr x x x vj x o x J-J 




1 ft4 
x o *± 






1 77 

X / / 


249-266 


FAGWT.PYYTGT^Tn^FTT. 
x rt. v_ v v xj xr x xx vj luiuor xu 


1 4 
u x*± 


2 02 






1 7ft 
X / o 


253 -270 


LPYYTGT ^TD^FTT.T.FT T 

XJ IT X J. XVJX O X L>u XT X XJ XJ X_j X X 


ft m 

O JX 


4 7 9 






17 9 

X / _7 


257-274 


TGT^TD^FTT.T.FT TTCOGP 

XV7XOXXJOX7 X XJXJXXj X X Iv^NJV^ 


1 1 4 

X X *± D 


77 4 




Jv 


1 ft 0 

X O VJ 


261 -27fi 


TD^FTT.T.FT TKOGCFFFKT 

X X> O X X XJXJXXj X X ivy Ov. Xx. X7 Hi XV 


ft ft4 
J o o ^ 


74 72 


Y 




1 ft 1 

X O X 


£* vJ J A O ^-i 


T T .T ,F T T KOGCF FFNTTVHK 

X XJ XJ X_i X X Ivy v7V>£j X X7-tXM X V XiXV 


-J -7 


117 

XX/ 






1 ft 9 

X O *L 


269-286 

U -/ A o vj> 


x x ivyvjv-i-jr ijiM i v Xi.rv.vv x o x 


R1 ft 
~J X O 


1 06 

XUD 






1 ft 7 

X o s> 


277 - o on 


vJv_X7-i 17 XTjXM X V XTrvVX X O X X ILirtXJ 


D / O 


2 64 

-o O *± 






1 ft4 


277-2 94 


FMTVHKWT ^TTFAT.AFFH 
x_ii>i x v n rv vv x o x x x_ijri.j_Lrt.jr xr xx 


7 2 7 


7 1 R 






1 ft ^ 

X O ZJ 


2ff1 -29ft 

^1 O X .<_. _7 O 


HKWT9TTFAT.AFFHrrT,N 

XI XV VV X O X X XXj.rtXJJri.X7 X7 XI \_. V_ XJXN 


~J / ID 


1 67 






1 ft6 

X O D 


285-302 


^TTRAT.AFFHCCT.NPTLY 
o x x r ^* n 1 c nvrV^ijiN xr x xj x 


0 0 


1 ftft 

X o o 






1 ft 7 

X o / 


289-306 

.__ VJ _7 J U V 


ALAFFHrPT.MPTT.YAFT.G 

rt 1 IrtX X XX \— V_XJXM xr X XJ XX^XT XJw 


«J* -7 -J> 


1 ft 1 

X O X 






1 ft ft 

-LOO 


9 Q7 - 7 1 0 


FHPPT ,NP T T ,YA FT .G AKFK 

X7 flv-v-XJlM XT X XJ Xrt.X7 XJvJrt.X\.X7 XV. 


j j j 


1 27 

X6 J 






1 ft 9 

X O -7 


9 97-71 4 


T.NPTT.YAFT.GAKFKT^AO 

XJ1N XT X XJ X .rt X7 XJ Orrt.X\.X7 Xv. X Oriy 


D O D 


2 74 






1 90 

X _7 VJ 


301-318 

_> Vv X ~3 X O 


T.YAFT.GAKFKT^AOHAT.T 

XJ X rt. X7 XJVj7rt.Xv.X7 XV X Ort.y/XXrtXJ X 




i c;6 

X _> D 






1 91 

X -7 X 


305-322 


T.GAKFKT^AOWAT.T^V^R 

XJv_Jrt.XVX7 XV X Ort.ylJ_rt.XJ X O V OA. 


fil 2 
u x _ 


2 0 0 

— - VJ Vj 






192 


309-326   FKTSAQHALTSVSRGSSL         585     173                        193
313-330   AQHALTSVSRGSSLKILS         559     147                        194
317-334   LTSVSRGSSLKILSKGKR         595     183                        195
321-338   SRGSSLKILSKGKRGGHS         581     169                        196
325-342   SLKILSKGKRGGHSSVST         697     285                        197
329-346   LSKGKRGGHSSVSTESES         597     185                        198
333-350   KRGGHSSVSTESESSSFH         579     167                        199
337-352   HSSVSTESESSSFHSS           515     103                        200



These data indicate that, in addition to polypeptide
sequences derived from positions 1-18 of the CXCR4
receptor, the polypeptide sequences LLLTIPDFIFANVSEADD (SEQ
ID NO: 16) (165-182), VVFQFQHIMVGLILPGIV (SEQ ID NO: 17)
(197-214), and IDSFILLEIIKQGCEFEN (SEQ ID NO: 18) (261-278)
comprise multiple subsequences, each of which is capable of
binding to HIV-1 envelope gp120.
Example 3

This example identifies segments of the STRL33
co-receptor that bind with gp120.

The first column in the table below indicates the
amino acid positions of each peptide in the wild-type STRL33
receptor. The second column explicitly identifies the peptide
sequence. The third and fourth columns indicate the
radioactive counts recorded in twenty minutes (i.e., the
cpm x 20) after the background or non-specific counts had
been subtracted. The fifth column contains an X in each
row for which the listed polypeptide bound with high
affinity to gp120. The sixth and final column contains an
X in each row wherein the listed sequence binds with
substantial affinity but is weak in comparison to other
samples, particularly adjacent samples.
SEQ SEG   PEPTIDE                                    Major      Minor      SEQ
                                                     Activity   Activity   ID
                                                     Peak       Peak       NO:




empcy vconcroi; 


O A 

- -54 


c 
. b 


O A 

o4 


c 
. b 








T 1 Q 

1 — 1 O 


MArjHDxniiJJxvjr bbr JNIJJo 


11 /o 


. 5 


t o o n 
13 2 0 


. b 




A 


OAT 

2 01 


c o o 
b — 2 2 


T^TTTTT^'TTT/'A C O C? TPTvTTT C O A TP "C 

DxrirjJJxvjr bbf JNJJbbQrjhj 


"3 "D C O 


c 
. 5 


"5 C O Q 

o 68 9 


. b 




A 


o n o 
2 0 2 


Q O C 

9 — 


TTT/' A TT» O O "CMXTC? CAPDIJA A C 


O C O Q 

ob /9 


c 
. b 


8 90 9 


c 
. b 


A 




o r\ o 
2 0 3 


i "2 on 
13 - jU 




2 bo 9 


c 
. b 


o o c o 

2 /b / 


c 
. b 




A 


O Pt A 

2 0 4 


1 *7 O A 

1 / - 3 4 


DbbyiirjriijAr Jj^r bKVr Li 


q a q 
ob 9 


c 
. b 


O 1 c o 
2152 


c 
. b 




A 


one 
2 0b 


21-38 


CiiiriyAr J-i^r bKVr JjFCMx 


O O T C 

2 J 16 


c 
. b 


T Q T Cx 
1819 


c 
. b 




v 
A 


one 
2 0b 


i c /in 


Af LiUr bKVr L1FL.IYI x li Vvr 1 


1 A O 1 

142 1 


c 
. b 


1 O C Q 

1 6 b9 


c 
. b 


/ 


A 


o n o 
2 0/ 




T7 O T/T 7T7T TTf'lUT'VT T TT 7TJT TPP T 

r bKVr LrLMiLVVr VCLjJj 


C O /I 

bo4 


c 
. b 


coo 
6 3 3 


c 
. b 






o n q 
2 0 8 


■jo r- n 
3 3 - 5 0 


riT TTAlVyiT/T TnTDTrPPT TTPMC 

F Li PCM x li W F V CvjLi VCjN b 


605 


. 5 


o o o 


. b 






o n a 
2 0 9 


i / -b4 


lUTVT T TT TDT 7AAT T7ATvTOT T TT T 7 

MxliVVr VC(jJ_iV(jJNIbL»Vij V 


T C O 

168 


c 
. 5 


o o c 
2.3 5 


. b 






o t n 
210 




T 7 TTT rppT T TPMP T T TT T 7X O X T? 

Vr VUtjli VvjJMbliVij V Iblr 


con 
5 /0 


c 
. 5 


O O /I 

284 


c 
. b 






O T T 
211 




AT T rPMOT T TT UTCT PVUVT 

LrliVCjNbliVijVlblr xrlJvli 


T ^ A 

164 


c 
. b 


Q C 

9b 


c 
. b 






O T O 

212 


49-66 


XT r~i T T TT TrTCTDVUVT A C? T rp 

NSLiVLiVl S I F x HKL1QSL1T 


T O C C 

12 55 


. 5 


T O O O 

13 78 


c 
. 5 




A 


o n o 
213 


c 0 0 n 


T TTTCTDVUVT ACT TTTT TTT 1 T 

l/Vlblr xHKLiQbljTDVFLi 


T O f\ 

162 0 


. 5 


T O O /T 

17 8 0 


. 5 


A 




O T A 

2 14 


CO O /I 

b / - /4 


X E'VUT/T ACT rp pyr IDT T 7TVTT TT 

lr xrlKJ-iybLil JJVr LiVNIjP 


T O O C 

12 /b 


. 5 


T O C C 

12 bo 


. b 




A 


O 1 c 
21b 


0 1 - / 0 


VT ACT Tm 7T?T T TXTT TTT AHT 

iSljCJblil JJVr li VJU1jP1jAJJJ_i 


>1 T O 

412 


c 
. 5 


O /I o 

348 


c 
. 5 






o i a 
21b 


C Q O 




o o o 


c 
. b 


o o c 
3 3 6 


c 
. 5 






O T o 
21/ 


6 9-86 


T7»T T TTVTT TTT A T*\T TTCTTPTIT TT TT 1 

FL VNLi PIjADIjVFVCTIj P F 


O /T 

7 0 


. 5 


C T 

51 


. 5 






O T Q 
2 1 O 


7 J - 90 


IjPIjADLVFVCx LiPFWAx a 


ceo 

557 


. 5 


o zt n 
960 


. 5 




A 


O T Q 

219 


O O O /I 

/ / - y4 


TOT UDUPTT TT ITTaT TWAPTUU 

JJIi vr VLlLfr WAxALylrlli 


T T T C 
1116 


c 

. b 


T r\ £C O 

1063 


c 
. b 




v 
A 


o o n 
Z <L U 


0 T r\ 0 

0 1 -98 


Vtl LiPr WAxACjlrlhiW Vr Lj 


T o t a 
1819 


. b 


T O C /I 

1 /54 


. b 




A 


O O 1 

2 2 1 


o5 - 102 


TT"CT*7 A TT A Z" 1 X T_T TT" T»TT 7"CA AT 7Ti/I A 

Pr WAxAOlrliiW VrCjy VMC 


o o ^ o 
/ 262 


. b 


o c o o 
/b3 / 


. b 


V" 

A 




■ o o O 
' 2 2 2 


8 9-106 


V7\A T T-J'CTaTT TTP/^ 1 AT 7TV/T ^ V O T T 

x AVj 1 riili W V t (jy V M C Kb liJ-i 


C Q T T 

5 911 


c 
. 5 


a o a c 
624 b 


c 
. b 




v 
A 


o o o 
2 3 


Q O T T n 


TJT^TaTT 7 , 'C , /^*/OT 7TVA/^*t< r C?T T /^TT/HP 

rlii W V r v^y VMCrvbLilAjl x 1 


O O Q T 

jo 91 


c 
. b 


34 66 


c 
. b 




A 


O O /l 

2 2 4 


qh t t /i 
9 / - ±14 


TP/^i AT TTV/r PVCT T AT VT 1 T TvT T?X7 

r VMUKbJ-ilAjl x 1 IJMr x 


TOCO 

12b / 


c 
. b 


1 O C vl 

13 b4 


c 
. b 




Y 

A 


O O cr 
2 2b 


t n t 1 1 0 


ML-Kblilitjl x 1 IJMr x 1 bMLi 


t c r\ c 
150 5 


c 
. b 


T O Q O 

12 83 


c 
. b 






O O C 
Z Z b 


T f\ C TOO 

105-122 


T T A T VT 1 T XT'C 1 T7'»T» CIVTIT XT rp A 

JjXjGIYTINFxTSMLiIIjTC 


/ion 
499 


. 5 


4 08 


. 5 






O O "7 

2 2/ 


i n q lo^r 
109-12 6 


\7-rp "T ■KTT7 , T7' r P O 7VIT XT TAT TT TTT 

x 1 IJMr x 1 bMlillilCl 1 VJJ 


OCT 

o bl 


c 
. b 


c t n 
b 1 0 


c 
. b 






O O Q 


113 - 1j 0 


T?T7' r P01VAT XT r P/**» X TT 7TTTO "C XT 7 

r X 1 bMlilLil LI 1 VJJKr IV 


/4 4 


c 
. b 


90 / 


c 

. b 






O O Q 


T T O T O /I 

117 - 1 34 


MLi 1 LjTC I TVDRF I VvVKA 


O Q Q 

2 98 


c 
. b 


O O Q 

2 2 8 


c 
• b 






o o n 


tot no 
12 1 - lj 0 


rp A X TT 7TT"D I? X T T\ 7T TV A T* V A V 

1L11 VJJKr 1 VVVKA 1 J\Ax 


Q Q 

8 9 


. b 


o a a 
3 4 o 


c 
. b 






O O 1 


1 OC T vl O 

12b - 142 


T TTTO TT 1 XT TT TT 7V7\ rpr/Ti T/TVTAA A 

VDKr 1 VVVKAI KAYJSiyyA 


t n o 

10-5 


c 
. b 


c o 
b3 


c 
. b 






o o o 


1 O Q T A C 

12 9-146 


XT TT TT 7V A rp V A VKTAA7V VDIUfT 1 

1 VV VJ\AI J\AxJ>jyyAJxKMI 


t c a 
166 


c 
. b 


A O 

43 


c 
. b 






o o o 

^ -5 -5 


TOO 1 cn 

Ijj - lbO 


V A T 1 V A T7XTAA A VD lUrrpr.T a VT 7 

J\A 1 KA x JNJ y yAJN_KM 1 WCjK V 


o n i 
/ 0 1 


c 
• b 


C C Q 

b 6 8 


. b 






O O A 


lo7- 154 


A VMA A A VTTTiyirpT*7A T/T TfppT T 

A x NQQAKKMT WGKVTS Jj J_i 


c c 

55 


. 5 


4 


IT 

. 5 






o o c 
z 3 b 


TAT 1 CO 
14 1 — 1DO 


<^ 7A VO MT'TaT^ 1 VT 7T 1 C T T TTaTWT 
yAJVKlnX WVjKV 1 oJjJ-iIVM V 1 


_ o T 
— / X 


c 
. b 


- 0 1 

- 3 1 


. b 








SEQ SEG   PEPTIDE                                    Major      Minor      SEQ
                                                     Activity   Activity   ID
                                                     Peak       Peak       NO:

145-162   MTWGKVTSLLIWVISLLV            -0.5    -26.5                        237
149-166   KVTSLLIWVISLLVSLPQ           -39.5   -118.5                        238
153-170   LLIWVISLLVSLPQIIYG            42.5     75.5                        239
157-174   VISLLVSLPQIIYGNVFN           -60.5   -127.5                        240
161-178   LVSLPQIIYGNVFNLDKL            91.5    -15.5                        241
165-182   PQIIYGNVFNLDKLICGY           -18.5    -37.5                        242
169-186   YGNVFNLDKLICGYHDEA           -41.5    -20.5                        243
173-190   FNLDKLICGYHDEAISTV          1072.5   1078.5             X          244
177-194   KLICGYHDEAISTVVLAT          1363.5   1604.5             X          245
181-198   GYHDEAISTVVLATQMTL           754.5   1181.5             X          246
185-202   EAISTVVLATQMTLGFFL          3973.5   3745.5  X                     247
189-206   TVVLATQMTLGFFLPLLT          2327.5   2389.5             X          248
193-210   ATQMTLGFFLPLLTMIVC          2365.5   2444.5             X          249
197-214   TLGFFLPLLTMIVCYSVI          2387.5    479.5                        250
201-218   FLPLLTMIVCYSVIIKTL          1270.5   1195.5             X          251
205-222   LTMIVCYSVIIKTLLHAG          2787.5   2654.5  X                     252
209-226   VCYSVIIKTLLHAGGFQK          1334.5   1143.5             X          253
213-230   VIIKTLLHAGGFQKHRSL           961.5    682.5                        254
217-234   TLLHAGGFQKHRSLKIIF          1041.5    999.5                        255
221-238   AGGFQKHRSLKIIFLVMA           340.5    260.5                        256
225-242   QKHRSLKIIFLVMAVFLL           810.5    814.5                        257
229-246   SLKIIFLVMAVFLLTQMP           612.5    853.5                        258
233-250   IFLVMAVFLLTQMPFNLM           386.5    772.5                        259
237-254   MAVFLLTQMPFNLMKFIR          2263.5   2842.5  X                     260
241-258   LLTQMPFNLMKFIRSTHW          2513.5   3154.5  X                     261
245-262   MPFNLMKFIRSTHWEYYA          2171.5   2182.5             X          262
249-266   LMKFIRSTHWEYYAMTSF           934.5    949.5                        263
253-270   IRSTHWEYYAMTSFHYTI          1571.5   1807.5             X          264
257-274   HWEYYAMTSFHYTIMVTE          2040.5   3065.5  X                     265
261-278   YAMTSFHYTIMVTEAIAY          2688.5   2359.5             X          266
265-282   SFHYTIMVTEAIAYLRAC           761.5   1033.5                        267
269-286   TIMVTEAIAYLRACLNPV           140.5    272.5                        268
273-290   TEAIAYLRACLNPVLYAF           604.5    480.5                        269
277-294   AYLRACLNPVLYAFVSLK          1802.5   1849.5             X          270
281-298   ACLNPVLYAFVSLKFRKN          4173.5   4515.5  X                     271
285-302   PVLYAFVSLKFRKNFWKL          1859.5   2147.5             X          272
289-306   AFVSLKFRKNFWKLVKDI           808.5   1040.5                        273
293-310   LKFRKNFWKLVKDIGCLP           920.5    957.5                        274
297-314   KNFWKLVKDIGCLPYLGV           143.5     82.5                        275
301-318   KLVKDIGCLPYLGVSHQW            -2.5     27.5                        276
305-322   DIGCLPYLGVSHQWKSSE            17.5     78.5                        277
309-326   LPYLGVSHQWKSSEDNSK           111.5    122.5                        278
313-330   GVSHQWKSSEDNSKTFSA           208.5    306.5                        279
317-334   QWKSSEDNSKTFSASHNV           464.5    533.5                        280
321-338   SEDNSKTFSASHNVEATS           524.5    434.5                        281
325-342   SKTFSASHNVEATSMFQL          1524.5   1239.5  X                     282
These data indicate that, in addition to polypeptide
sequences derived from positions 9-26 of the STRL33
receptor, the polypeptide sequences LVISIFYHKLQSLTDVFL (SEQ
ID NO: 19) (53-70), PFWAYAGIHEWVFGQVMC (SEQ ID NO: 20)
(85-102), EAISTVVLATQMTLGFFL (SEQ ID NO: 21) (185-202),
LTMIVCYSVIIKTLLHAG (SEQ ID NO: 22) (205-222),
MAVFLLTQMPFNLMKFIRSTHW (SEQ ID NO: 23) (237-258),
HWEYYAMTSFHYTIMVTE (SEQ ID NO: 24) (257-274),
ACLNPVLYAFVSLKFRKN (SEQ ID NO: 25) (281-298) and
SKTFSASHNVEATSMFQL (SEQ ID NO: 26) (325-342) comprise
multiple subsequences, each of which is capable of binding
to HIV-1 envelope gp120.

Example 4

This example identifies segments of the human CD4
protein that bind with gp120.

The second column in the table below identifies
the amino acid residue sequence of the polypeptide employed
in the assay. The first column identifies the sequence
coordinates of human CD4 that have an identical amino acid
sequence. The third column indicates the number of
radioactive decays (i.e., counts) that were counted, which
is indicative of the affinity of the synthetic polypeptide
for the gp120 protein. In the table below, polypeptides
retaining more than 4,000 counts identify fragments that
have a substantial capability to bind with gp120.
Polypeptides retaining more than 6,000 counts have more
substantial binding affinity. Polypeptides retaining at
least about 10,000 counts have a substantial and strong
capacity to bind to gp120. Of course, fragments
corresponding to amino acid coordinates 101-121 and 106-126
have a substantial, strong, and dominant capacity to bind
to gp120.
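
The count thresholds described above can be expressed as a small Python sketch. The function name `classify_binding`, the label strings, and the reading of "at least about 10,000" as a strict `>= 10000` cut-off are assumptions made for illustration; the four example counts are taken from the table for this example.

```python
def classify_binding(counts):
    """Map a raw count to the descriptive binding category used in the text."""
    if counts >= 10000:   # "at least about 10,000 counts" (assumed exact cut-off)
        return "substantial and strong"
    if counts > 6000:     # "more than 6,000 counts"
        return "more substantial"
    if counts > 4000:     # "more than 4,000 counts"
        return "substantial"
    return "below threshold"


# CD4 coordinates mapped to counts reported in the table.
examples = {"1-21": 3587, "6-26": 4356, "101-121": 6803, "91-111": 11646}
labels = {fragment: classify_binding(c) for fragment, c in examples.items()}
for fragment, label in labels.items():
    print(fragment, label)
```

Checking the highest threshold first keeps each fragment in exactly one category.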

                                                  SEQ ID NO:

B1   (1)   1-21      MNRGVPFRHLLLVLQLALLPA    3587   283
C1   (2)   6-26      PFRHLLLVLQLALLPAATQGK    4356   284
D1   (3)   11-31     LLVLQLALLPAATQGKKVVLG    1785   285
E1   (4)   16-36     LALLPAATQGKKVVLGKKGDT    1759   286
F1   (5)   21-41     AATQGKKVVLGKKGDTVELTC    1562   287
G1   (6)   26-46     KKVVLGKKGDTVELTCTASQK    1910   288
H1   (7)   31-51     GKKGDTVELTCTASQKKSIQF    1831   289
A2   (8)   36-56     TVELTCTASQKKSIQFHWKNS    1732   290
B2   (9)   41-61     CTASQKKSIQFHWKNSNQIKI    1717   291
C2   (10)  46-66     KKSIQFHWKNSNQIKILGNQG    2182   292
D2   (11)  51-71     FHWKNSNQIKILGNQGSFLTK    1835   293
E2   (12)  56-76     SNQIKILGNQGSFLTKGPSKL    1487   294
F2   (13)  61-81     ILGNQGSFLTKGPSKLNDRAD    1467   295
G2   (14)  66-86     GSFLTKGPSKLNDRADSRRSL    1844   296
H2   (15)  71-91     KGPSKLNDRADSRRSLWDQGN    1912   297
A3   (16)  76-96     LNDRADSRRSLWDQGNFPLII    1753   298
B3   (17)  81-101    DSRRSLWDQGNFPLIIKNLKI    2224   299
C3   (18)  86-106    LWDQGNFPLIIKNLKIEDSDT    3264   300
D3   (19)  91-111    NFPLIIKNLKIEDSDTYICEV   11646   301
E3   (20)  96-116    IKNLKIEDSDTYICEVEDQKE    8439   302
F3   (21)  101-121   IEDSDTYICEVEDQKEEVQLL    6803   303
G3   (22)  106-126   TYICEVEDQKEEVQLLVFGLT   44965   304
H3   (23)  111-131   VEDQKEEVQLLVFGLTANSDT   36249   305
A4   (24)  116-136   EEVQLLVFGLTANSDTHLLQG   14171   306
B4   (25)  121-141   LVFGLTANSDTHLLQGQSLTL    3683   307
C4   (26)  126-146   TANSDTHLLQGQSLTLTLESP    6114   308
D4   (27)  131-151   THLLQGQSLTLTLESPPGSSP    2552   309
E4   (28)  136-156   GQSLTLTLESPPGSSPSVQCR    1538   310
F4   (29)  141-161   LTLESPPGSSPSVQCRSPRGK    1476   311
G4   (30)  146-166   PPGSSPSVQCRSPRGKNIQGG    1496   312
H4   (31)  151-171   PSVQCRSPRGKNIQGGKTLSV    1400   313
A5   (32)  156-176   RSPRGKNIQGGKTLSVSQLEL    2066   314
B5   (33)  161-181   KNIQGGKTLSVSQLELQDSGT    3078   315
C5   (34)  166-186   GKTLSVSQLELQDSGTWTCTV    2618   316
D5   (35)  171-191   VSQLELQDSGTWTCTVLQNQK    3879   317
E5   (36)  176-196   LQDSGTWTCTVLQNQKKVEFK    2456   318
F5   (37)  181-201   TWTCTVLQNQKKVEFKIDIVV    4030   319
G5   (38)  186-206   VLQNQKKVEFKIDIVVLAFQK    9737   320
H5   (39)  191-211   KKVEFKIDIVVLAFQKASSIV    6313   321
A6   (40)  196-216   KIDIVVLAFQKASSIVYKKEG    3681   322
B6   (41)  201-221   VLAFQKASSIVYKKEGEQVEF    3566   323
C6   (42)  206-226   KASSIVYKKEGEQVEFSFPLA   14347   324
D6   (43)  211-231   VYKKEGEQVEFSFPLAFTVEK   14740   325
E6   (44)  216-236   GEQVEFSFPLAFTVEKLTGSG   18549   326
F6   (45)  221-241   FSFPLAFTVEKLTGSGELWWQ    9673   327
G6   (46)  226-246   AFTVEKLTGSGELWWQAERAS    3992   328
H6   (47)  231-251   KLTGSGELWWQAERASSSKSW    1878   329
A7   (48)  236-256   GELWWQAERASSSKSWITFDL    2730   330
B7   (49)  241-261   QAERASSSKSWITFDLKNKEV    2588   331
C7   (50)  246-266   SSSKSWITFDLKNKEVSVKRV    1761   332
D7   (51)  251-271   WITFDLKNKEVSVKRVTQDPK    2126   333
E7   (52)  256-276   LKNKEVSVKRVTQDPKLQMGK    2288   334
F7   (53)  261-281   VSVKRVTQDPKLQMGKKLPLH    1848   335
G7   (54)  266-286   VTQDPKLQMGKKLPLHLTLPQ    2075   336
H7   (55)  271-291   KLQMGKKLPLHLTLPQALPQY    1949   337
A8   (56)  276-296   KKLPLHLTLPQALPQYAGSGN    1922   338
B8   (57)  281-301   HLTLPQALPQYAGSGNLTLAL    2394   339
C8   (58)  286-306   QALPQYAGSGNLTLALEAKTG    2364   340
D8   (59)  291-311   YAGSGNLTLALEAKTGKLHQE    1830   341
E8   (60)  296-316   NLTLALEAKTGKLHQEVNLVV    1676   342
F8   (61)  301-321   LEAKTGKLHQEVNLVVMRATQ    1729   343
G8   (62)  306-326   GKLHQEVNLVVMRATQLQKNL    1776   344
H8   (63)  311-331   EVNLVVMRATQLQKNLTCEVW    2183   345
A9   (64)  316-336   VMRATQLQKNLTCEVWGPTSP    2144   346
B9   (65)  321-341   QLQKNLTCEVWGPTSPKLMLS    1856   347
C9   (66)  326-346   LTCEVWGPTSPKLMLSLKLEN    2412   348
D9   (67)  331-351   WGPTSPKLMLSLKLENKEAKV    2414   349
E9   (68)  336-356   PKLMLSLKLENKEAKVSKREK    1656   350
F9   (69)  341-361   SLKLENKEAKVSKREKAVWVL    1663   351
G9   (70)  346-366   NKEAKVSKREKAVWVLNPEAG    1735   352
H9   (71)  351-371   VSKREKAVWVLNPEAGMWQCL    2034   353
A10  (72)  356-376   KAVWVLNPEAGMWQCLLSDSG    3133   354
B10  (73)  361-381   LNPEAGMWQCLLSDSGQVLLE    6316   355
C10  (74)  366-386   GMWQCLLSDSGQVLLESNIKV    4185   356
D10  (75)  371-391   LLSDSGQVLLESNIKVLPTWS    2375   357
E10  (76)  376-396   GQVLLESNIKVLPTWSTPVQP    2089   358
F10  (77)  381-401   ESNIKVLPTWSTPVQPMALIV    1992   359
G10  (78)  386-406   VLPTWSTPVQPMALIVLGGVA    2197   360
H10  (79)  391-411   STPVQPMALIVLGGVAGLLLF    2527   361
A11  (80)  396-416   PMALIVLGGVAGLLLFIGLGI    3067   362
B11  (81)  401-421   VLGGVAGLLLFIGLGIFFCVR    3738   363
C11  (82)  406-426   AGLLLFIGLGIFFCVRCRHRR    2099   364
D11  (83)  411-431   FIGLGIFFCVRCRHRRRQAER    1900   365
E11  (84)  416-436   IFFCVRCRHRRRQAERMSQIK    2085   366
F11  (85)  421-441   RCRHRRRQAERMSQIKRLLSE    2075   367
G11  (86)  426-446   RRQAERMSQIKRLLSEKKTCQ    1607   368
H11  (87)  431-451   RMSQIKRLLSEKKTCQCPHRF    2020   369
A12  (88)  436-456   KRLLSEKKTCQCPHRFQKTCS    1674   370
B12  (89)  441-458   EKKTCQCPHRFQKTCSPI       2006   371
A1   (0)             empty (control)          2075





Example 5

This example shows the binding of 125I-HIV-1 gp120 to
the amino termini of CCR5, CXCR4, and STRL33 as a function
of position and length. Synthetic
peptide arrays of nonapeptides, dodecapeptides,
pentadecapeptides and octadecapeptides derived from the CCR5
(panel A), CXCR4 (panel B) and STRL33 (panel C) amino
terminal domains were prepared and used to test the
binding of 125I-HIV-1 envelope gp120. Ordinal sequence
position numbers are given in accordance with the sequence
data provided by the Genbank database for CCR5 (accession
No. g1457946, gi|1457946), CXCR4 (accession No. g539677,
gi|400654, sp|P30991) and STRL33 (accession No. g2209288,
gi|2209288). The counts shown are the counts detected in
each well minus the background counts (i.e., counts
observed in the assay when no polypeptide was bound to the
well of the 96-well assay plate).
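
The layout of the scanning peptide arrays, in which 9-, 12-, 15- and 18-mers share the same starting residue, can be sketched in Python as follows. The function name `scanning_windows` is hypothetical, and the 31-residue string is simply the stretch of CCR5 spanned by the peptides listed in Panel A; this is an illustration of the window layout, not a description of how the peptides were synthesized.

```python
def scanning_windows(sequence, lengths=(9, 12, 15, 18)):
    """Yield (start, {length: peptide}) for each start with at least one window.

    Starts are 1-based, matching the ordinal position numbers in the panels.
    """
    shortest = min(lengths)
    for i in range(len(sequence) - shortest + 1):
        windows = {
            n: sequence[i:i + n]
            for n in lengths
            if i + n <= len(sequence)  # keep only windows that fit
        }
        yield i + 1, windows


# Residues 1-31 of CCR5, as covered by the peptides in Panel A.
ccr5_nterm = "MDYQVSSPIYDINYYTSEPCQKINVKQIAAR"
rows = dict(scanning_windows(ccr5_nterm))
print(len(rows))   # number of starting positions
print(rows[15])    # windows that fit when starting at position 15
```

Near the end of the domain the longer windows no longer fit, which is why the later rows of each panel list only 15-, 12- or 9-residue peptides.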
Panel A: CCR5

Initial   Peptide Sequence Scanning Windows    Binding Results for Window Length
Sequence  (In each sequence row 9-, 12-, 15-,  (counts bound - background
#         18-mers share the same initial       (no peptide))
          starting point.)

          xxxxxxxxx            9
          xxxxxxxxxxxx         12
          xxxxxxxxxxxxxxx      15
          xxxxxxxxxxxxxxxxxx   18

                                      9     12     15     18    SEQ ID NO:

1         MDYQVSSPIYDINYYTSE        543   2682   4976   5880    372
2         DYQVSSPIYDINYYTSEP       1552   3089   5401   6363    373
3         YQVSSPIYDINYYTSEPC       2533   5305   5415   6119    374
4         QVSSPIYDINYYTSEPCQ        490   1959   4594   5645    375
5         VSSPIYDINYYTSEPCQK        509   1629   3280   3521    376
6         SSPIYDINYYTSEPCQKI        671   1739   3498   3285    377
7         SPIYDINYYTSEPCQKIN       1503   3463   4575   3234    378
8         PIYDINYYTSEPCQKINV       1186   2285   2682   2036    379
9         IYDINYYTSEPCQKINVK       1359   2702   2516   1261    380
10        YDINYYTSEPCQKINVKQ       4379   5245   3052   1913    381
11        DINYYTSEPCQKINVKQI       1396   1361   1144    712    382
12        INYYTSEPCQKINVKQIA       1384   1190    707    684    383
13        NYYTSEPCQKINVKQIAA       1548    977    760    595    384
14        YYTSEPCQKINVKQIAAR       1029   1052    847    638    385
15        YTSEPCQKINVKQIA           567    507    459           386
16        TSEPCQKINVKQIAA           440    427    509           387
17        SEPCQKINVKQIAAR           434    430    426           388
18        EPCQKINVKQIA              397    432                  389
19        PCQKINVKQIAA              386    385                  390
20        CQKINVKQIAAR              435    581                  391
21        QKINVKQIA                 453                         392
22        KINVKQIAA                 487                         393
23        INVKQIAAR                 474                         394
Panel B: CXCR4

Initial   Peptide Sequence Scanning Windows    Binding Results for Window Length
Sequence  (In each sequence row 9-, 12-, 15-,  (counts bound - background)
#         18-mers share the same initial
          starting point.)

          xxxxxxxxx            9
          xxxxxxxxxxxx         12
          xxxxxxxxxxxxxxx      15
          xxxxxxxxxxxxxxxxxx   18

                                      9     12     15     18    SEQ ID NO:

1         MEGISIYTSDNYTEEMGS        591    334   3275   2079    395
2         EGISIYTSDNYTEEMGSG          a    886   7255   1548    396
3         GISIYTSDNYTEEMGSGD        454   2644   3274   1217    397
4         ISIYTSDNYTEEMGSGDY        466   3973   2202    861    398
5         SIYTSDNYTEEMGSGDYD          a    288    168    239    399
6         IYTSDNYTEEMGSGDYDS        332    335    195    173    400
7         YTSDNYTEEMGSGDYDSM        181    161    201    103    401
8         TSDNYTEEMGSGDYDSMK          a     54    119     38    402
9         SDNYTEEMGSGDYDSMKE        151    149    124    161    403
10        DNYTEEMGSGDYDSMKEP         67    121     57    102    404
11        NYTEEMGSGDYDSMKEPC          a    100     30    134    405
12        YTEEMGSGDYDSMKEPCF         68    213     70    103    406
13        TEEMGSGDYDSMKEPCFR        146     67     23     47    407
14        EEMGSGDYDSMKEPCFRE          a     61    121    130    408
15        EMGSGDYDSMKEPCFREE         64     36     69     64    409
16        MGSGDYDSMKEPCFREEN         57     68     64    129    410
17        GSGDYDSMKEPCFREENA          a    155    172    155    411
18        SGDYDSMKEPCFREENAN        100    118    186     89    412
19        GDYDSMKEPCFREENANF         53    167    198    134    413
20        DYDSMKEPCFREENANFN          a    167    146     75    414
21        YDSMKEPCFREENANFNK        171    144     80     89    415
22        DSMKEPCFREENANFNKI         85    144    146     40    416
23        SMKEPCFREENANFN             a    119     55           417
24        MKEPCFREENANFNK           188    133     74           418
25        KEPCFREENANFNKI           165    105     93           419
26        EPCFREENANFN                a     69                  420
27        PCFREENANFNK              104    108                  421
28        CFREENANFNKI              103     66                  422
29        REENANFNK                  58                         423

a Not done
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Panel C 


Peptide Sequence 


Binding Results 


For 


Window Length 




Scanning Windows 












STRL33 




(counts 


bound - 


background) 






(In each sequence 












miLiai 


row y - , ±z- , lb-, 














io nicis uric 












# 


same initial 
starting point.) 














xxxxxxxxx 9 


9 








SEQ 




xxxxxxxxxxxx 12 




12 






ID 




xxxxxxxxxxxxxxx 15 






lb 




NO: 




xxxxxxxxxxxxxxxxxx 1 8 
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x 


rirtunu xnciL/ x \jc oof imjjo 


i £ n 


OZj 


1239 


1386 


A 7 A 
ft Z ft 




rujilJJ IflCjU X \jC OorWUoO 


7 ^ A 
.5 3ft 


£ Q7 


1095 


1014 


A 7 <=; 
ft Z D 


7 


PMnYHPnvnT? c q pun qqd 

c*nu x nDU x ur oorivuoo^ 




Q7 7 


2235 


1219 


A 7 £ 
ft Z O 


*± 




7 n ft 


1/17 7 
X ft Z / 


1772 


1500 


A 7 7 
ft Z / 


cr 


n ytj pti vr^p q c pMn c cop p 


ft c; 1 


X O Oft 


1240 


1191 


A 7 ft 
ft z o 


D 


x nuiLj iur ooFiNiJoo^FjFjri 


too 
/ZO 


1-7 jU 


1357 


985 


A 7 Q 
ft z y 


7 


niLJj iur oof lNXJoouj-CjEiri^ 


TOO 
/ Z J7 


1 O 7 7 
X U / / 


947 


537 


a 7 n 

ft J u 


ft 
o 


i-ji-/ lor oof iNJJo os^rjiiri^M. 


OCT 

-7 J J 


ft 1 7 
OX/ 


1152 


548 


A 7 1 
ft o X 


Q 


nYHPQ C PMT) QCAPFUna P 

U X VjF DoriNL'C O Fjrl^.r-iF 


/ Ul 


c; 7 7 

O / -J 


595 


440 


A 7 7 
ft o Z 


i n 


VOP Q Q PNTD Q GOPPWri A PT . 
Iur OOF 1NUO OV^FjFjilS^/i-lF Xj 


7 a <=; 


/ ft o 


645 


1138 


A 7 7 
ft o o 


X -L 


\jV OOF !NiJOOyiZ»iIiXi.^i-4.F XJ^ 


X / X 


a p n 
ft o u 


270 


1639 


A 7 A 
ft o ft 


X £, 


17 Q q PKTn c Q OT? P WO A PT ,H P 

F OOF IMXyOOvJFjFjxl^i-iF XjV^F 


7 A Q 

Z ft 


a n 7 

ft U j 


361 


3608 


A 7 c: 
ft j D 


1 


Q C Pisrn C onp punA PT .HP Q 
OOF INLJO O^FjijiiV^I^-iF XjV^F O 


7 A 7 
Z ft .5 


Oil 
Aft 


902 


6038 


A 7 £ 
ft J) D 


1 J. 

Xft 


OF lNlJOO^FjFjFlV^i-lF J-iUJF O xv 


7 n a 

3 U ft 


7 n 7 


969 


4537 


A 7 7 
ft o / 


1 R 


PNTD CJ q OP PUH A P T .O P Q irv 


246 


470 


4089 


4678 


438 


16 


NDSSQEEHQAFLQFS 


180 


497 


6160 




439 


17 


DSSQEEHQAFLQFSK 


147 


882 


4588 




440 


18 


SSQEEHQAFLQFSKV 


287 


4455 


4732 




441 


19 


SQEEHQAFLQFS 


647 


7512 






442 


20 


QEEHQAFLQFSK 


1109 


5672 






443 


21 


EEHQAFLQFSKV 


6060 


5598 






444 


22 


EHQAFLQFS 


7505 








445 


23 


HQAFLQFSK 


2761 








446 


i 24 


QAFLQFSKV 


2600 








447 



Example 6 

This example shows 125I-HIV-1LAI gp120 binding to
N-terminal peptide variants of CCR5, CXCR4 and STRL33.

Octadecapeptide alanine replacement variants of the
peptides showing peak gp120 binding activity were synthesized
and tested for 125I-HIV-1LAI gp120 binding. Each binding value
presented is the average of two separate synthesis and
binding experiments. Relative percentage of Control =
[(mean counts/Control counts) x 100%] ± average
deviation. Background counts (no peptide, see Example 7)
were subtracted from all values. Data for CCR5 are
presented in Panel A; data for CXCR4 are presented in Panel
B; and data for STRL33 are presented in Panel C.
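The calculation described above can be sketched as follows. This is only an illustrative sketch: the function name, the argument layout, and the reading of "average deviation" as the mean absolute deviation of the replicate percentages are assumptions, and the numbers in the usage line are hypothetical rather than taken from the experiments.

```python
def relative_percent_of_control(replicate_counts, control_counts, background):
    """Return (mean percent of control, average deviation) for one variant.

    replicate_counts: raw counts bound in each replicate experiment
    control_counts:   raw counts bound by the wild-type (control) peptide
    background:       counts bound with no peptide, subtracted from all values
    """
    control = control_counts - background
    if control <= 0:
        raise ValueError("control counts must exceed background")
    # Express every background-corrected replicate as a percentage of control.
    percents = [100.0 * (c - background) / control for c in replicate_counts]
    mean = sum(percents) / len(percents)
    # Assumed meaning of "average deviation": mean absolute deviation.
    average_deviation = sum(abs(p - mean) for p in percents) / len(percents)
    return mean, average_deviation


# Hypothetical duplicate measurements for one alanine variant.
mean, deviation = relative_percent_of_control([1100, 1300], 2100, 100)
print(f"{mean:.0f} ± {deviation:.0f} % of control")
```

With this convention the wild-type peptide is 100% by construction, which matches the footnote to each panel.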

Panel A. 125I-HIV-1LAI gp120 binding to N-terminal peptide
variants of CCR5

Variant   CCR5 variant peptides   Relative % of   SEQ ID
          (1-18)                  Control a       NO:

Control   MDYQVSSPIYDINYYTSE      100             448
M1A       ADYQVSSPIYDINYYTSE      167 ± 4         449
D2A       MAYQVSSPIYDINYYTSE      125 ± 8         450
Y3A       MDAQVSSPIYDINYYTSE      51 ± 2          451
Q4A       MDYAVSSPIYDINYYTSE      104 ± 7         452
V5A       MDYQASSPIYDINYYTSE      82 ± 3          453
S6A       MDYQVASPIYDINYYTSE      124 ± 3         454
S7A       MDYQVSAPIYDINYYTSE      56 ± 2          455
P8A       MDYQVSSAIYDINYYTSE      157 ± 2         456
I9A       MDYQVSSPAYDINYYTSE      24 ± 7          457
Y10A      MDYQVSSPIADINYYTSE      19 ± 6          458
D11A      MDYQVSSPIYAINYYTSE      63 ± 22         459
I12A      MDYQVSSPIYDANYYTSE      14 ± 1          460
N13A      MDYQVSSPIYDIAYYTSE      253 ± 19        461
Y14A      MDYQVSSPIYDINAYTSE      15 ± 0.3        462
Y15A      MDYQVSSPIYDINYATSE      21 ± 5          463
T16A      MDYQVSSPIYDINYYASE      78 ± 34         464
S17A      MDYQVSSPIYDINYYTAE      64 ± 6          465
E18A      MDYQVSSPIYDINYYTSA      4 ± 2           466



a The percent binding for the wild-type peptide was 
defined as 100%. 






Panel B. 125I-HIV-1LAI gp120 binding to N-terminal peptide
variants of CXCR4

Variant   CXCR4 variant peptides  Relative % of   SEQ ID
          (1-18)                  Control a       NO:

Control   MEGISIYTSDNYTEEMGS      100             467
M1A       AEGISIYTSDNYTEEMGS      118 ± 18        468
E2A       MAGISIYTSDNYTEEMGS      36 ± 0.3        469
G3A       MEAISIYTSDNYTEEMGS      101 ± 3         470
I4A       MEGASIYTSDNYTEEMGS      6 ± 0.3         471
S5A       MEGIAIYTSDNYTEEMGS      133 ± 5         472
I6A       MEGISAYTSDNYTEEMGS      2 ± 1           473
Y7A       MEGISIATSDNYTEEMGS      7 ± 0.4         474
T8A       MEGISIYASDNYTEEMGS      97 ± 10         475
S9A       MEGISIYTADNYTEEMGS      70 ± 4          476
D10A      MEGISIYTSANYTEEMGS      71 ± 8          477
N11A      MEGISIYTSDAYTEEMGS      38 ± 0.4        478
Y12A      MEGISIYTSDNATEEMGS      28 ± 2          479
T13A      MEGISIYTSDNYAEEMGS      70 ± 6          480
E14A      MEGISIYTSDNYTAEMGS      72 ± 1          481
E15A      MEGISIYTSDNYTEAMGS      56 ± 7          482
M16A      MEGISIYTSDNYTEEAGS      88 ± 4          483
G17A      MEGISIYTSDNYTEEMAS      68 ± 8          484
S18A      MEGISIYTSDNYTEEMGA      79 ± 1          485



a The percent binding for the wild-type peptide was
defined as 100%.

Panel C. 125I-HIV-1LAI gp120 binding to N-terminal peptide
variants of STRL33

Variant   STRL33 variant          Relative % of   SEQ ID
          peptides (21-38)        Control a       NO:

Control   EEHQAFLQFSKVFLPCMY      100             486
E21A      AEHQAFLQFSKVFLPCMY      81 ± 2          487
E22A      EAHQAFLQFSKVFLPCMY      70 ± 1          488
H23A      EEAQAFLQFSKVFLPCMY      99 ± 1          489
Q24A      EEHAAFLQFSKVFLPCMY      72 ± 1          490
A25A      EEHQAFLQFSKVFLPCMY      101 ± 1         491
F26A      EEHQAALQFSKVFLPCMY      32 ± 0.1        492
L27A      EEHQAFAQFSKVFLPCMY      37 ± 2          493
Q28A      EEHQAFLAFSKVFLPCMY      44 ± 0.4        494
F29A      EEHQAFLQASKVFLPCMY      20 ± 1          495
S30A      EEHQAFLQFAKVFLPCMY      92 ± 2          496
K31A      EEHQAFLQFSAVFLPCMY      162 ± 2         497
V32A      EEHQAFLQFSKAFLPCMY      51 ± 3          498
F33A      EEHQAFLQFSKVALPCMY      45 ± 2          499
L34A      EEHQAFLQFSKVFAPCMY      76 ± 1          500
P35A      EEHQAFLQFSKVFLACMY      82 ± 3          501
C36A      EEHQAFLQFSKVFLPAMY      53 ± 5          502
M37A      EEHQAFLQFSKVFLPCAY      112 ± 4         503
Y38A      EEHQAFLQFSKVFLPCMA      83 ± 2          504

a The percent binding for the wild-type peptide was
defined as 100%.













Example 7 

This example demonstrates that the binding of HIV-1
gp120 envelope protein to the polypeptides of the present
invention and to the chemokine receptors from which the
present inventive polypeptides were originally derived or
inspired is conserved across the various strains of HIV-1.
This example also demonstrates that a step subsequent to
initial binding of gp120 to CCR5, CXCR4, STRL33, and CD4 is
the most likely source of the phenomenon of host-range
selectivity. Additionally, this example demonstrates that
the underlying method is accurate in that receptor variants
that are predicted to have an altered affinity for binding
with gp120 do in fact have a statistically similar
alteration in affinity where comparable changes in the
receptors have been identified in other work and the
affinity for binding of gp120 or the effect on infectivity
has been measured.

This example examines the effect of particular
mutations of CCR5 that were studied in the work underlying
the present invention and that were also studied by other
artisans in the field.

The following table identifies a mutation in the first
column. The first letter designates the wild-type amino
acid present at the position indicated by the number, and
the letter A, which terminates every entry in the first
column, indicates that the amino acid residue present at
that position in the mutant polypeptide is alanine. For
example, the first data row (i.e., the second row of the
table) contains the entry Y3A in the first column, which
indicates that the tyrosine residue at position 3 of the
wild-type CCR5 is substituted by an alanine residue.
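The naming convention just described can be made concrete with a short sketch. The helper names and the first_position parameter are illustrative assumptions (the latter allows for segments that do not start at residue 1, such as STRL33 residues 21-38); the sequences are the wild-type control peptides of Example 6.

```python
import re


def apply_mutation(wild_type, label, first_position=1):
    """Apply a point mutation such as 'Y3A' to a peptide sequence."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"unrecognised mutation label: {label!r}")
    old, position, new = match.group(1), int(match.group(2)), match.group(3)
    index = position - first_position
    if not 0 <= index < len(wild_type):
        raise ValueError(f"position {position} lies outside the peptide")
    if wild_type[index] != old:
        raise ValueError(f"expected {old} at position {position}")
    return wild_type[:index] + new + wild_type[index + 1:]


def alanine_scan(wild_type, first_position=1):
    """Return (label, sequence) for every single-residue alanine replacement."""
    return [
        (f"{residue}{first_position + i}A",
         wild_type[:i] + "A" + wild_type[i + 1:])
        for i, residue in enumerate(wild_type)
    ]


ccr5 = "MDYQVSSPIYDINYYTSE"     # CCR5 (1-18)
strl33 = "EEHQAFLQFSKVFLPCMY"   # STRL33 (21-38)
print(apply_mutation(ccr5, "Y3A"))
print(alanine_scan(strl33, first_position=21)[5])
```

A residue that is already alanine (A25 of STRL33) yields a "variant" identical to the wild-type peptide, which serves as an internal repeat of the control.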

The second column provides the percentage of binding
exhibited by a mutant polypeptide compared to a wild-type
polypeptide, when the methods used to elucidate the present
invention are used in conjunction with radiolabeled HIV-1LAI
gp120 envelope protein. The third through seventh columns
provide similar data that have been extracted from the work
of others in the field using the strain of HIV-1 virus
indicated at the top of each column. For example, row 2 of
the following table indicates that when the mutation Y3A is
effected in the human CCR5 chemokine receptor, the
resulting CCR5 polypeptide has 51.4% of the ability to bind
HIV-1LAI gp120 envelope protein in comparison to an
equivalent wild-type peptide. Similarly, HIV-1ADA binds to
the mutant polypeptide with 79% of the affinity of a
non-mutated CCR5 chemokine receptor.

Mutation   gp120    YU2    ADA    JR-FL   89.6   DH123

Y3A        51.4     n/a    79     82      n/a    42
Q4A        104      85     132    111     67     105
Y10A       19.2     2      50     26      10     3
D11A       62.8     2      27     22      6      3
Y14A       14.6     12     47     25      6      0
Y15A       21       30     3      3       1      0
E18A       4.1      45     12     12      3      10



Statistical analysis of these data indicates that the
similarity between the binding affinity of each mutant
peptide for gp120 elucidated in this study is not more than
about 25% likely to be causally unrelated to the effects
observed for YU2, and not more than about 4% likely to be
causally unrelated to the effects observed for each of the
other viruses listed in the table above.
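The text does not say which statistical test was applied. Purely as an illustration of how two columns of the table could be compared, the sketch below computes a Pearson correlation coefficient between the gp120 column and the ADA column; the choice of Pearson correlation and the function name are assumptions, not the method used in the underlying work.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / sqrt(sxx * syy)


# Rows Y3A, Q4A, Y10A, D11A, Y14A, Y15A, E18A of the table above.
gp120_this_study = [51.4, 104, 19.2, 62.8, 14.6, 21, 4.1]
ada = [79, 132, 50, 27, 47, 3, 12]
r = pearson_r(gp120_this_study, ada)
print(f"r = {r:.2f}")
```

Columns containing n/a entries (YU2 and 89.6 for Y3A) would need those rows dropped before a coefficient is computed.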

Additionally, the affinity measurements generated by
the underlying technique have been demonstrated to be
accurate by (repetitively) showing that antibodies that
specifically bind to radiolabeled gp120 are capable of
preventing the binding of gp120 to polypeptides that have
shown high affinity for binding with gp120 in the
experiments upon which the present invention is predicated.
Thus, this example shows that the binding of HIV-1 with
chemokine receptors can be inhibited by the present inventive
polypeptides, irrespective of the strain of HIV-1 from
which the gp120 protein is obtained.

Example 8 

This example provides a characterization of the
critical amino acids in the amino-terminal segments of
CCR5, CXCR4, and STRL33 that are essential for the ability
of these polypeptides to bind with gp120.

In this example, the effect on binding that occurs
due to successive replacement of each amino acid with alanine
is indicated, wherein a (+) signifies a decrease in binding
affinity and a (>) signifies an enhancement in binding
affinity. The sequences are shown with the amino-terminus
at top and the carboxyl-terminus at bottom.



CCR5 (1-18)    CXCR4 (1-18)    STRL33 (21-38)

M>             M               E
                               E
Y++                            H
Q              I+++++
V              S>              A
S              I++++++         F+++
S+             Y+++++          L++
P>             T               Q+
I+++           S+              F+++
Y+++           D+              S
D+             N++             K>
I++++          Y++             V+
N>             T               F+
Y++++          E               L
Y+++           E++             P
T              M               C+
S+             G               M
E+++++         S               Y



Example 9 

This example employs the same technique as Example 4
and provides information similar to that available from
Example 4.

The data below compare the ability of synthetic
fragments of CD4 to bind to labeled gp120. 9-mers, 12-mers,
15-mers, 18-mers, and 21-mers were selected based on the data
from Example 4. The relative binding affinities of each
group of polypeptides can be determined by inspection of
the number of counts of radiolabeled gp120 that were
retained by each N-mer. Data supporting these conclusions
are provided by Examples 10 and 11.

Peptide      Active Peptides          gp120 bound   SEQ ID
starting                              (counts)      NO:
position #

ACTIVE 9-MERS

105          DTYICEVED                1043          505
115          KEEVQLLVF                1273          506
116          EEVQLLVFG                3170          507
117          EVQLLVFGL                2146          508
217          EQVEFSFPL                1032          509
218          QVEFSFPLA                1205          510
219          VEFSFPLAF                1064          511

ACTIVE 12-MERS

101          IEDSDTYICEVE             1107          530
112          EDQKEEVQLLVF             1379          531
113          DQKEEVQLLVFG             1624          532
114          QKEEVQLLVFGL             1785          533
115          KEEVQLLVFGLT             1774          534
116          EEVQLLVFGLTA             3261          535
117          EVQLLVFGLTAN             1838          536
133          LLQGQSLTLTLE             1320          537
215          EGEQVEFSFPLA             1456          538
216          GEQVEFSFPLAF             1729          539
217          EQVEFSFPLAFT             1556          540
218          QVEFSFPLAFTV             1636          541

ACTIVE 15-MERS

109          CEVEDQKEEVQLLVF          1729          512
110          EVEDQKEEVQLLVFG          2805          513
111          VEDQKEEVQLLVFGL          3816          514
112          EDQKEEVQLLVFGLT          3633          515
113          DQKEEVQLLVFGLTA          3905          516
114          QKEEVQLLVFGLTAN          3770          517
115          KEEVQLLVFGLTANS          3485          518
116          EEVQLLVFGLTANSD          6423          519
117          EVQLLVFGLTANSDT          2689          520
130          DTHLLQGQSLTLTLE          1622          521
131          THLLQGQSLTLTLES          1874          522
132          HLLQGQSLTLTLESP          1277          523
213          KKEGEQVEFSFPLAF          1921          524
214          KEGEQVEFSFPLAFT          3253          525
215          EGEQVEFSFPLAFTV          3270          526
216          GEQVEFSFPLAFTVE          4656          527
217          EQVEFSFPLAFTVEK          4135          528
218          QVEFSFPLAFTVEKL          2047          529

ACTIVE 18-MERS

105          DTYICEVEDQKEEVQLLV       1648          542
106          TYICEVEDQKEEVQLLVF       3794          543
107          YICEVEDQKEEVQLLVFG       4611          544
108          ICEVEDQKEEVQLLVFGL       3898          545
109          CEVEDQKEEVQLLVFGLT       3797          546
110          EVEDQKEEVQLLVFGLTA       3647          547
111          VEDQKEEVQLLVFGLTAN       3913          548
112          EDQKEEVQLLVFGLTANS       3416          549
113          DQKEEVQLLVFGLTANSD       3317          550
114          QKEEVQLLVFGLTANSDT       3671          551
127          ANSDTHLLQGQSLTLTLE       1540          552
128          NSDTHLLQGQSLTLTLES       1726          553
129          SDTHLLQGQSLTLTLESP       1260          554
210          IVYKKEGEQVEFSFPLAF       5382          555
211          VYKKEGEQVEFSFPLAFT       4307          556
212          YKKEGEQVEFSFPLAFTV       4839          557
213          KKEGEQVEFSFPLAFTVE       4683          558
214          KEGEQVEFSFPLAFTVEK       3117          559
215          EGEQVEFSFPLAFTVEKL       2164          560
216          GEQVEFSFPLAFTVEKLT       1643          561

ACTIVE 21-MERS

90           GNFPLIIKNLKIEDSDTYICE    5248          562
91           NFPLIIKNLKIEDSDTYICEV    7803          563
92           FPLIIKNLKIEDSDTYICEVE    13919         564
93           PLIIKNLKIEDSDTYICEVED    20145         565
94           LIIKNLKIEDSDTYICEVEDQ    17108         566
95           IIKNLKIEDSDTYICEVEDQK    11892         567
96           IKNLKIEDSDTYICEVEDQKE    15073         568
97           KNLKIEDSDTYICEVEDQKEE    8789          569
99           LKIEDSDTYICEVEDQKEEVQ    5519          570
100          KIEDSDTYICEVEDQKEEVQL    6325          571
101          IEDSDTYICEVEDQKEEVQLL    12064         572
102          EDSDTYICEVEDQKEEVQLLV    4933          573
103          DSDTYICEVEDQKEEVQLLVF    30277         574
104          SDTYICEVEDQKEEVQLLVFG    30319         575
105          DTYICEVEDQKEEVQLLVFGL    25424         576
106          TYICEVEDQKEEVQLLVFGLT    20191         577
107          YICEVEDQKEEVQLLVFGLTA    22884         578
108          ICEVEDQKEEVQLLVFGLTAN    7276          579
109          CEVEDQKEEVQLLVFGLTANS    3517          580
123          FGLTANSDTHLLQGQSLTLTL    11529         581
124          GLTANSDTHLLQGQSLTLTLE    14065         582
125          LTANSDTHLLQGQSLTLTLES    17113         583
126          TANSDTHLLQGQSLTLTLESP    23595         584
204          FQKASSIVYKKEGEQVEFSFP    9382          585
205          QKASSIVYKKEGEQVEFSFPL    24959         586
206          KASSIVYKKEGEQVEFSFPLA    30873         587
207          ASSIVYKKEGEQVEFSFPLAF    25146         588
208          SSIVYKKEGEQVEFSFPLAFT    28068         589
209          SIVYKKEGEQVEFSFPLAFTV    8165          590
210          IVYKKEGEQVEFSFPLAFTVE    15620         591
221          FSFPLAFTVEKLTGSGELWWQ    4163          592
222          SFPLAFTVEKLTGSGELWWQA    2284          593
223          FPLAFTVEKLTGSGELWWQAE    6276          594
224          PLAFTVEKLTGSGELWWQAER    2647          595
225          LAFTVEKLTGSGELWWQAERA    3577          596
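The comparison "by inspection" of retained counts described in this example can be approximated with a simple ranking, sketched below. The function name is illustrative, and the background value of 550 counts is a hypothetical stand-in for an empty-well control, not a figure reported for these peptides; the peptide counts are four of the 9-mer entries listed in this example.

```python
def rank_by_counts(counts_by_peptide, background):
    """Sort peptides by retained counts (highest first) and report each
    signal as a multiple of the background counts."""
    if background <= 0:
        raise ValueError("background must be positive")
    ranked = sorted(counts_by_peptide.items(),
                    key=lambda item: item[1], reverse=True)
    return [(peptide, counts, counts / background)
            for peptide, counts in ranked]


nine_mers = {
    "DTYICEVED": 1043,
    "KEEVQLLVF": 1273,
    "EEVQLLVFG": 3170,
    "EVQLLVFGL": 2146,
}

# background=550.0 is an assumed value used only for illustration.
for peptide, counts, fold in rank_by_counts(nine_mers, background=550.0):
    print(f"{peptide}  {counts:5d}  {fold:.1f}x background")
```

Ranking within a single peptide length is the safer comparison, since longer peptides in this example retain markedly more counts overall.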



Example 10 

This example provides data which enable those skilled
in the art to arrive at the conclusions indicated in
Examples 9 and 12. In this example, the counts of
radiolabeled gp120 retained by each peptide indicated in
the left hand column are given in the right hand column.
The first panel (Panel A) provides data for 21-mers of CD4.
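The peptides in these panels are overlapping windows that advance one residue at a time along CD4. A minimal sketch of how such a series can be enumerated is given below; the function name is illustrative, and the starting position of 86 for the fragment is inferred from the position numbering of Example 9 (where the 21-mer beginning GNFPL is listed at position 90) and should be treated as an assumption.

```python
def sliding_windows(sequence, length, first_position=1):
    """Return (start position, peptide) for every window of the given
    length, advancing one residue at a time."""
    if length <= 0:
        raise ValueError("window length must be positive")
    return [
        (first_position + i, sequence[i:i + length])
        for i in range(len(sequence) - length + 1)
    ]


# Fragment of CD4 covering the first five 21-mers of Panel A.
fragment = "LWDQGNFPLIIKNLKIEDSDTYICE"
windows = sliding_windows(fragment, 21, first_position=86)
for position, peptide in windows:
    print(position, peptide)
```

The same helper yields the 9-, 12-, 15- and 18-mer series of the later panels when called with the corresponding length.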



Panel A

PEPTIDE                  COUNTS   SEQ ID
                                  NO:

LWDQGNFPLIIKNLKIEDSDT    731      597
WDQGNFPLIIKNLKIEDSDTY    889      598
DQGNFPLIIKNLKIEDSDTYI    1138     599
QGNFPLIIKNLKIEDSDTYIC    2242     600
GNFPLIIKNLKIEDSDTYICE    5248     601
NFPLIIKNLKIEDSDTYICEV    7803     602
FPLIIKNLKIEDSDTYICEVE    13919    603
PLIIKNLKIEDSDTYICEVED    20145    604
LIIKNLKIEDSDTYICEVEDQ    17108    605
IIKNLKIEDSDTYICEVEDQK    11892    606
IKNLKIEDSDTYICEVEDQKE    15073    607
KNLKIEDSDTYICEVEDQKEE    8789     608
NLKIEDSDTYICEVEDQKEEV    2016     609
LKIEDSDTYICEVEDQKEEVQ    5519     610
KIEDSDTYICEVEDQKEEVQL    6325     611
IEDSDTYICEVEDQKEEVQLL    12064    612
EDSDTYICEVEDQKEEVQLLV    4933     613
DSDTYICEVEDQKEEVQLLVF    30277    614
SDTYICEVEDQKEEVQLLVFG    30319    615
DTYICEVEDQKEEVQLLVFGL    25424    616
TYICEVEDQKEEVQLLVFGLT    20191    617
YICEVEDQKEEVQLLVFGLTA    22884    618
ICEVEDQKEEVQLLVFGLTAN    7276     619
CEVEDQKEEVQLLVFGLTANS    3517     620
EVEDQKEEVQLLVFGLTANSD    1687     621
VEDQKEEVQLLVFGLTANSDT    646      622
EDQKEEVQLLVFGLTANSDTH    562      623
DQKEEVQLLVFGLTANSDTHL    599      624
QKEEVQLLVFGLTANSDTHLL    573      625
KEEVQLLVFGLTANSDTHLLQ    682      626
EEVQLLVFGLTANSDTHLLQG    690      627
EVQLLVFGLTANSDTHLLQGQ    589      628
VQLLVFGLTANSDTHLLQGQS    1099     629
QLLVFGLTANSDTHLLQGQSL    2057     630
LLVFGLTANSDTHLLQGQSLT    860      631
LVFGLTANSDTHLLQGQSLTL    4677     632
VFGLTANSDTHLLQGQSLTLT    2762     633
FGLTANSDTHLLQGQSLTLTL    11529    634
GLTANSDTHLLQGQSLTLTLE    14065    635
LTANSDTHLLQGQSLTLTLES    17113    636
TANSDTHLLQGQSLTLTLESP    23595    637
Empty (control)          515
TWTCTVLQNQKKVEFKIDIVV    1430     638
WTCTVLQNQKKVEFKIDIVVL    1616     639
TCTVLQNQKKVEFKIDIVVLA    1092     640
CTVLQNQKKVEFKIDIVVLAF    2909     641
TVLQNQKKVEFKIDIVVLAFQ    3273     642
VLQNQKKVEFKIDIVVLAFQK    1323     643
LQNQKKVEFKIDIVVLAFQKA    1256     644
QNQKKVEFKIDIVVLAFQKAS    1808     645
NQKKVEFKIDIVVLAFQKASS    1507     646
QKKVEFKIDIVVLAFQKASSI    759      647
KKVEFKIDIVVLAFQKASSIV    782      648
KVEFKIDIVVLAFQKASSIVY    635      649
VEFKIDIVVLAFQKASSIVYK    725      650
EFKIDIVVLAFQKASSIVYKK    649      651
FKIDIVVLAFQKASSIVYKKE    593      652
KIDIVVLAFQKASSIVYKKEG    1394     653
IDIVVLAFQKASSIVYKKEGE    962      654
DIVVLAFQKASSIVYKKEGEQ    788      655
IVVLAFQKASSIVYKKEGEQV    646      656
VVLAFQKASSIVYKKEGEQVE    772      657
VLAFQKASSIVYKKEGEQVEF    1793     658
LAFQKASSIVYKKEGEQVEFS    1410     659
AFQKASSIVYKKEGEQVEFSF    3775     660
FQKASSIVYKKEGEQVEFSFP    9382     661
QKASSIVYKKEGEQVEFSFPL    24959    662
KASSIVYKKEGEQVEFSFPLA    30873    663
ASSIVYKKEGEQVEFSFPLAF    25146    664
SSIVYKKEGEQVEFSFPLAFT    28068    665
SIVYKKEGEQVEFSFPLAFTV    8165     666
IVYKKEGEQVEFSFPLAFTVE    15620    667
VYKKEGEQVEFSFPLAFTVEK    2429     668
YKKEGEQVEFSFPLAFTVEKL    735      669
KKEGEQVEFSFPLAFTVEKLT    1847     670
KEGEQVEFSFPLAFTVEKLTG    972      671
EGEQVEFSFPLAFTVEKLTGS    739      672
GEQVEFSFPLAFTVEKLTGSG    652      673
EQVEFSFPLAFTVEKLTGSGE    765      674
QVEFSFPLAFTVEKLTGSGEL    741      675
VEFSFPLAFTVEKLTGSGELW    633      676
EFSFPLAFTVEKLTGSGELWW    681      677
FSFPLAFTVEKLTGSGELWWQ    4163     678
SFPLAFTVEKLTGSGELWWQA    2284     679
FPLAFTVEKLTGSGELWWQAE    6276     680
PLAFTVEKLTGSGELWWQAER    2647     681
LAFTVEKLTGSGELWWQAERA    3577     682
AFTVEKLTGSGELWWQAERAS    1739     683
Empty (control)          617

The second and third panels (Panels B and C) provide data
for 18-mers of a small region of CD4.

Panel B 



PEP 1 1 JJE 


COUNTS 


SEQ ID NO: 


T TiTT^ /~\ f~* XT TP T^i T T TT/XTT XT 


502 


684 


TYTT^ XT "C T"i T T T T/TVTT "W T 

WJjyCjJMr PXjI IKNLiKl 


534 


/—Of - 

685 


■p\ /"*\ /~1 "NT TT» Ti T T T WTT TS~ T 


635 


6 86 


yvjJNr PXjX 1 KJnXjKX ED 


509 


6 8 7 


r i KTT?"DT TT VTVTT VTUnC 


C O A 


a o o 

ooo 


TvTCDT T T VTvTT VTPncn 


C C A 

6b4 


£7 0 0 

6 o y 


r ir J_i± X JSJ>JXjJN.X.CiiJolJ I 


53 9 


/T O C\ 

6 9 U 


DT T T VTvTT VTUHCnTV 

PXjX X J\i\IXji\.X EJJoJJ X x 


661 


/TOT 

6 91 


T T T VTvTT VTUnCHTVT 
XjX XlSJNXjJVXrjJJibJJX X X 


b4 z 


/TOO 


T T VTvTT VTDnCnTVTP 


CCA 

664 


6 93 


X rUNXjKXEDbDX llLh 


568 


6 94 


VTvTT VTUROnT'VTPDTr 


^ c o 

562 


c\ rz 

6 9b 


TvTT VT^nCnTVTPUUI? 
JNXjJS.Xrj]JolJX I X Cfci y/Vj 


X X6 U 


6 y 6 


XjJvX EDbDX x X CEVED 


84 6 


f O *~7 

6 9/ 


vt cncr\TVT r^Trwcr^/^s 

jvxiiiJoiJ x x xujti viiijy 


1 C\ O Q 

lOoo 


o o 

6 y o 


XcjXJoUX x XCliVnjXJyJv 


1 i / *5 

114 3 


/— o o 

6 9 9 


EDoUX x ILei VEJjyjtvE 


815 


•~7 O C\ 


JJoUX x XCE VEJJylvEE 


y i6 


/ 0 1 


oJJX x XCEVEDyrvJiE V 


C\ o o 

993 


TOO 

7 02 


Uli XCEVEDyKEEVy 


10 71 


7 03 


X x X(_E VEDyJvfciE VC^Xj 


n rr /- 

956 


/ 04 


x X C hi V iiJjy KE E VCJXjXj 


1064 


n A C 

/0b 


X U Ji V hjUy K±j hi XjXj V 


1084 


/ 0 6 


CEVEDQKEEVQLIjVF 


172 9 


7 0 7 


li V JiUy JSJiii v y XjXj V r Lx 


o q n c 
zoU j 


/ U o 


T7T?nnvi7u\7nT t \ 7T7"ot 
VrLJjyxviiii N/yXjXj Vr CjXj 


3 O 16 


^7 n o 

/ u y 


TTT^O VCTTT/Z^T T \7'U > t~ t T TP 

EJjyJvEE vyXjXjVr CjXj X 


^ 1 

-3 63 3 


/ 1 u 


U\2 JMi Cj V y XjXj V r CjXj X A 


*D Q A C 

3 9 0b 


/ll 


PVTTUUHT T T7T7PT T 7\ "NT 

yjNJiii, vyXjXj Vr CjXj X AJNI 


3 / /0 


/ 1 z 


VT?T?T7nT T T TTPr^T TAMO 

JvEE V CJXjXjVr CjXjXAJno 


O /I o c 

34 8 5 


/ 1 3 


EEVQELVFGLTANSD 


6423 


714 


EVQLLVFGLTANSDT 


2689 


715 


VQLLVFGLTANSDTH 


1006 


716 


QLLVFGLTANSDTHL 


865 


717 


LLVFGLTANSDTHLL 


599 


718 


LVFGLTANSDTHLLQ 


609 


719 


VFGLTANSDTHLLQG 


532 


720 


FGLTANSDTHLLQGQ 


625 


721 


GLTANSDTHLLQGQS 


532 


722 


LTANSDTHLLQGQSL 


634 


723 


TANSDTHLLQGQSLT 


513 


724 


ANSDTHLLQGQSLTL 


542 


725 
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inoU x rixjXjyijy blj X xj 1 


oil 


"~7 o a 
/ZD 


CFlTT-TT T HPHOT T'T TT 
oU ± rlijijy O J_l 1 J_i 1 Xj 


*"7 /I T 


TOT 

/2 / 


JJ 1 rlJ_iijyvjy bJ_i 1 -b 1 Xjli 


i too 


"7 O O 

/ 2 o 




Xo /4 


-7 O Q 

/ 2 y 


TTT T ApAHT rp-r rpy riOD 

nij Lj(J b J_i 1 Li 1 ij hj o F 




*~7 O r\ 

/JO 


J_iWJJ(Jkji\lr Fill 1 KJNI1jKXx!jJJ 


con 
582 


731 


WJjyvjjJNlr FLil IKlNILiKliiUb 




/J 2 


JjyCjrJNJr Fij± 1 JMNlijKliilJoU 


59 8 


73 3 


yLxNr Fill ± JSJNijJ\XcjlJoJJ 1 


564 


/ 3 4 


KjNr Fiji X JSJNIliJvXrjJJbJJX x 


c c ^ 

557 


/3 5 


JNIr FliX X lUNxjlxXrjDbJJX x X 


c O *"? 

62 / 


/3 6 


r FxjX X JNXNxjJN.XCiJJox-/X ill 


C A O 

buy 


7 "2 *"7 
/3 / 


PT . T T TCMT .V T WTl QFlTV T C*T? 
±r XjX. X xvLVIXjIvX EjJJoJJ 1 IILd 


O ^ 


*7 "3 ft 
/JO 


XjX X JSXVlXjJVXxliJJolJX X Xl^rj V 


£T *3 /l 


-7 "2 Q 

/ 3 y 


X X JMNXjIN-X HjUoxJ 1 X X V_ii V Hi 


/ 5 X 


/ 4 U 


X iNXSIXjIvX JiUoXJ X x XL.XliVxljU 


by y 


•"7 1 

/ 4 X 


xxINJXjivX DUoU 1 X XL^IiVliiXjy 


■7 0 0 

/ U o 


1 A O 
/ 4 Z 


iNXjrvX CjUoU 111 L*rj VxtiXJ^iv 


ODJ 


1 A "3 

/ 4 J 


Xj IV X JZj U O iJ 1 X X VXliXjyiVCj 


o / Z 


7/1/1 


xMCjL'OL' 1 X X K^d V XZjXJ^ JVCjIZj 


QCQ 

o D c5 


7 A CC 
/ 4 D 


ICjUoUI X 1LD VHiXJU JXXIjCj V 




7 zi a 
/4b 


"FFl Q FIT V T r 1 T7\ 7T7 nn VE? 70 


TOO 

/ o o 


*7 Zl T 
/ 4 / 


Fi Q Fi T V T r 1 T? A 7TT Fi O V T7 ^ 7P* T 
XJoXJX X X V CjXJyiSJiXlj VyJj 


yol 


7/1 Q 

/ 4 o 


OU1 X X K~.ni V HiJJ^JVIZiIZi V 1*7 XjXj 


o / U 


7/1 Q 

/ 4 y 


U 1 X X U-Cj V liXJ^JVCjXj V \2 XjXj V 


X D4 t5 


/ jU 


TVT r , T?^7'K 1 FlOVT5 1 T7\70T T 

X x Xs^lL ViiUy*\X2'C' VyXjXj Vr 


O / j?4 


7C1 
/ 3 X 


VT r , "C 1 \7T?FlOTr'C , l?\70T T \7T?H 
X X v^Jli v XjU^ xvcjli V yXjXj V r \j 


/l 1 
4 D X X 


7 

/ z> z 


XV^Hi V tLUy^r^CiCi VvJjJjvr VjXj 


OQQQ 


7CT 


^Lti V £ZjUV'*^- j 'L- 1 V vXjXj Vr kjrXj X 


o / / 


7 zl 


IL V HiU^rvHiHj V v^XjXj Vr VjjXj X A 


*} ^ A *7 
O D 4 / 


7 R CI 


\ 717 FiOVT? 17^701 T \7T7P'T T7AM 
V XliX^yiSJiXlj V V^XjXj Vr VjtXj X/\iM 




7CC 
/DO 


JZjJJ^JXJZjJZj VyLLVr vjXj X/\IM o 




7C7 
/ 3 / 


xj^jviirj vyxjXj Vr vjXj x/HMoxJ 


o 3 X / 


7 c; p 

/JO 


^jvclil v ^xjXj v r vjjXj i/hndjj x 


JO / 1 


7CQ 
/ 3 -7 


A. XL XL V^XjXj V r LjXj XAiNoU X It 


1 oil 
X Z / X 


7 <^ n 

/OU 


rjHj V vXjXj V r kjXj XiAXNoxJ X rlxj 


7Q*1 


7 <^ 1 
/ O 1 


Ct V^XjXj V r VaXj Xi-UNoXJ X xIXjXj 


£ £ *7 


7^9 
/ O Z 


V^XjXj V r LjXj IHiNoJJ X IIXjXj^ 


£ *7 *i 


7 "3 
/DO 


^XjXj Vr VjxjX/\iNoxJXrlxjXjykj 




H A 
/ D *± 


XjXj V r VjjXj X /-UN OXJ X xlXjXj^JVjr^ 


CCD 


7 ^ 
/ O 3 


Xj V r LjXj XAxNoJJ X iIXjXjv^ Vjj^o 


o o 4 


/OO 


Vr ^jjXj Xi-UN oU X xIXjXjv^Ljv^oIj 


jjI 


7C7 

/ b / 


r CaxjXANbDlrlijLiyvjyoXiX 


5 y x 


7 q 

/DO 


GLTANSDTHLLQGQSLTL 


572 


769 


LTANSDTHLLQGQSLTLT 


528 


770 


TANSDTHLLQGQSLTLTL 


891 


771 


ANSDTHLLQGQSLTLTLE 


1540 


772 


NSDTHLLQGQSLTLTLES 


1726 


773 
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SDTHLLQGQSLTLTLESP 
Empty (control) 

Panel C 
PEPTIDE 

WTCTVLQNQKKVEFK 
TCTVLQNQKKVEFKI 
CTVLQNQ KKVE F KI D 
T VLQNQ KKVE F K I D I 
VLQNQKKVEFKIDIV 
LQNQKKVEFKIDIW 
QNQKKVEFKIDIWL 
NQKKVEFKIDIWLA 
QKKVEFKIDIWLAF 
KKVEFKIDIWLAFQ 
KVEFKIDIWLAFQK 
VEFKIDIWLAFQKA 
EFKIDIWLAFQKAS 
FKI D I WLAFQKAS S 
KIDIWLAFQKASSI 
I D I WLAFQKAS S I V 
D I WLAFQKAS SIVY 
I WLAFQKAS SI VYK 
WLAFQKAS S I VYKK 
VLAFQKASS IVYKKE 
LAFQKAS S I VYKKEG 
AFQKAS S I VYKKEGE 
FQKASS I VYKKEGEQ 
QKASSIVYKKEGEQV 
KAS S I VYKKEGEQVE 
ASSIVYKKEGEQVEF 
SSI VYKKEGEQVE FS 
SIVYKKEGEQVEFSF 
IVYKKEGEQVEFSFP 
VYKKEGEQVEFSFPL 
YKKEGEQVEFSFPLA 
KKEGEQVEFSFPLAF 
KEGEQVEFSFPLAFT 
EGEQVEFSFPLAFTV 
GEQVEFSFPLAFTVE 
EQVEFSFPLAFTVEK 
QVEFSFPLAFTVEKL 
VEFSFPLAFTVEKLT 
EFSFPLAFTVEKLTG 
F S F PLAFT VEKLTGS 
SFPLAFTVEKLTGSG 



1260 774 
575 



COUNTS SEQ ID NO: 



566 


775 


510 


776 


608 


777 


587 


778 


605 


779 


644 


780 


636 


781 


860 


782 


1333 


783 


951 


784 


1051 


785 


1005 


786 


1188 


787 


1001 


788 


956 


789 


865 


790 


776 


791 


783 


792 


577 


793 


634 


794 


593 


795 


544 


796 


637 


797 


519 


798 


563 


799 


589 


800 


558 


801 


651 


802 


615 


803 


714 


804 


687 


805 


1921 


806 


3253 


807 


3270 


808 


4656 


809 


4135 


810 


2047 


811 


899 


812 


920 


813 


672 


814 


565 


815 
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FPLAFTVEKLTGSGE 


556 


816 


PLAFTVEKLTGSGEL 


612 


817 


LAFTVEKLTGSGELW 


579 


818 


AFTVEKLTGSGELWW 


586 


819 


FTVEKLTGSGELWWQ 


625 


820 


TVEKLTGSGELWWQA 


550 


821 


VEKLTGSGELWWQAE 


735 


822 


EKLTGSGELWWQAER 


683 


823 


WTC T VLQNQ KKVE F K I D I 


588 


824 


TCTVLQNQKKVEFKIDIV 


571 


825 


CTVLQNQKKVEFKIDIW 


553 


826 


TVLQNQKKVEFKI D I WL 


655 


827 


VLQNQKKVE FK I D I WLA 


724 


828 


LQNQ KKVE F K I D I WLAF 


938 


829 


QNQKKVE FKI D I WLAFQ 


917 


830 


NQKKVEFKI D I WLAFQ K 


889 


831 


QKKVE FKI D I WLAFQKA 


1013 


832 


KKVEFKIDI WLAFQKAS 


912 


833 


KVEFKIDIWLAFQKASS 


1011 


834 


VEFKIDIWLAFQKASSI 


819 


835 


EFKIDIWLAFQKASSIV 


799 


836 


FKI D I WLAFQKAS S I VY 


843 


837 


KIDIWLAFQKASSIVYK 


779 


838 


I D I WLAFQKAS S IVYKK 


711 


839 


D I WLAFQKAS S I VYKKE 


660 


840 


I WLAFQKAS S I VYKKEG 


531 


841 


WLAFQKAS S I VYKKEGE 


560 


842 


VLAFQKASS I VYKKEGEQ 


549 


843 


LAFQKASS I VYKKEGEQV 


665 


844 


AFQKAS S I VYKKEGEQVE 


514 


845 


FQKAS S I VYKKEGEQVE F 


528 


846 


QKAS SI VYKKEGEQ VEFS 


602 


847 


KASSIVYKKEGEQVEFSF 


536 


848 


ASSIVYKKEGEQVEFSFP 


701 


849 


S S I VYKKEGEQ VE F S F PL 


756 


850 


SIVYKKEGEQVEFSFPLA 


771 


851 


IVYKKEGEQVEFSFPLAF 


5382 


852 


VYKKEGEQVEFSFPLAFT 


4307 


853 


YKKEGEQVEFSFPLAFTV 


4839 


854 


KKEGEQVE FS F PLAFTVE 


4683 


855 


KEGEQVEFSFPLAFTVEK 


3117 


856 


EGEQVEFSFPLAFTVEKL 


2164 


857 


GEQVEFSFPLAFTVEKLT 


1643 


858 


EQVEFSFPLAFTVEKLTG 


798 


859 


QVE F S F PLAFTVE KLTGS 


736 


860 


VEFSFPLAFTVEKLTGSG 


533 


861 


EFSFPLAFTVEKLTGSGE 


668 


862 


FS FPLAFTVEKLTGSGEL 


613 


863 
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SFPLAFTVEKLTGSGELW 


656 


864 


FPLAFTVEKLTGSGELWW 


586 


865 


PLAFTVEKLTGSGELWWQ 


650 


866 


LAFTVEKLTGSGELWWQA 


866 


867 


AFTVEKLTGSGELWWQAE 


788 


868 


FTVEKLTGSGELWWQAER 


1143 


869 


Empty (control ) 


556 





The fourth and fifth panels (Panels D and E) provide data
for select 9-mers and 12-mers of CD4.

Panel D



PEPTIDE COUNTS SEQ ID NO:
DQGNFPLII 662 870
QGNFPLIIK 508 871
GNFPLIIKN 600 872
NFPLIIKNL 561 873
FPLIIKNLK 601 874
PLIIKNLKI 697 875
LIIKNLKIE 515 876
IIKNLKIED 658 877
IKNLKIEDS 557 878
KNLKIEDSD 612 879
NLKIEDSDT 512 880
LKIEDSDTY 492 881
KIEDSDTYI 603 882
IEDSDTYIC 567 883
EDSDTYICE 650 884
DSDTYICEV 712 885
SDTYICEVE 819 886
DTYICEVED 1043 887
TYICEVEDQ 805 888
YICEVEDQK 728 889
ICEVEDQKE 596 890
CEVEDQKEE 555 891
EVEDQKEEV 587 892
VEDQKEEVQ 521 893
EDQKEEVQL 564 894
DQKEEVQLL 589 895
QKEEVQLLV 636 896
KEEVQLLVF 1273 897
EEVQLLVFG 3170 898
EVQLLVFGL 2146 899
VQLLVFGLT 815 900
QLLVFGLTA 822 901
LLVFGLTAN 576 902
LVFGLTANS 522 903
VFGLTANSD 549 904
FGLTANSDT 563 905
GLTANSDTH 481 906
LTANSDTHL 596 907
TANSDTHLL 554 908
ANSDTHLLQ 642 909
NSDTHLLQG 561 910
SDTHLLQGQ 526 911
DTHLLQGQS 578 912
THLLQGQSL 512 913
HLLQGQSLT 564 914
LLQGQSLTL 568 915
LQGQSLTLT 501 916
QGQSLTLTL 594 917
GQSLTLTLE 777 918
DQGNFPLIIKNL 604 919
QGNFPLIIKNLK 533 920
GNFPLIIKNLKI 547 921
NFPLIIKNLKIE 647 922
FPLIIKNLKIED 511 923
PLIIKNLKIEDS 565 924
LIIKNLKIEDSD 619 925
IIKNLKIEDSDT 511 926
IKNLKIEDSDTY 574 927
KNLKIEDSDTYI 523 928
NLKIEDSDTYIC 639 929
LKIEDSDTYICE 635 930
KIEDSDTYICEV 601 931
IEDSDTYICEVE 1107 932
EDSDTYICEVED 956 933
DSDTYICEVEDQ 937 934
SDTYICEVEDQK 846 935
DTYICEVEDQKE 720 936
TYICEVEDQKEE 818 937
YICEVEDQKEEV 734 938
ICEVEDQKEEVQ 585 939
CEVEDQKEEVQL 561 940
EVEDQKEEVQLL 508 941
VEDQKEEVQLLV 657 942
EDQKEEVQLLVF 1379 943
DQKEEVQLLVFG 1624 944
QKEEVQLLVFGL 1785 945
KEEVQLLVFGLT 1774 946
EEVQLLVFGLTA 3261 947
EVQLLVFGLTAN 1838 948
VQLLVFGLTANS 747 949
QLLVFGLTANSD 721 950
LLVFGLTANSDT 533 951
LVFGLTANSDTH 586 952
VFGLTANSDTHL 548 953
FGLTANSDTHLL 571 954
GLTANSDTHLLQ 574 955
LTANSDTHLLQG 534 956
TANSDTHLLQGQ 549 957
ANSDTHLLQGQS 559 958
NSDTHLLQGQSL 585 959
SDTHLLQGQSLT 540 960
DTHLLQGQSLTL 527 961
THLLQGQSLTLT 646 962
HLLQGQSLTLTL 701 963
LLQGQSLTLTLE 1320 964
Empty (control) 581



Panel E 



PEPTIDE COUNTS SEQ ID NO:
TVLQNQKKV 534 965
VLQNQKKVE 556 966
LQNQKKVEF 565 967
QNQKKVEFK 537 968
NQKKVEFKI 597 969
QKKVEFKID 575 970
KKVEFKIDI 501 971
KVEFKIDIV 555 972
VEFKIDIVV 548 973
EFKIDIVVL 665 974
FKIDIVVLA 568 975
KIDIVVLAF 665 976
IDIVVLAFQ 691 977
DIVVLAFQK 686 978
IVVLAFQKA 602 979
VVLAFQKAS 600 980
VLAFQKASS 466 981
LAFQKASSI 592 982
AFQKASSIV 595 983
FQKASSIVY 568 984
QKASSIVYK 494 985
KASSIVYKK 498 986
ASSIVYKKE 600 987
SSIVYKKEG 515 988
SIVYKKEGE 566 989
IVYKKEGEQ 534 990
VYKKEGEQV 490 991
YKKEGEQVE 518 992
KKEGEQVEF 546 993
KEGEQVEFS 595 994
EGEQVEFSF 735 995
GEQVEFSFP 697 996
EQVEFSFPL 1032 997
QVEFSFPLA 1205 998
VEFSFPLAF 1064 999
EFSFPLAFT 658 1000
FSFPLAFTV 472 1001
SFPLAFTVE 619 1002
FPLAFTVEK 569 1003
PLAFTVEKL 597 1004
LAFTVEKLT 501 1005
AFTVEKLTG 517 1006
FTVEKLTGS 574 1007
TVEKLTGSG 487 1008
VEKLTGSGE 585 1009
EKLTGSGEL 541 1010
KLTGSGELW 491 1011
LTGSGELWW 550 1012
TGSGELWWQ 507 1013
TVLQNQKKVEFK 563 1014
VLQNQKKVEFKI 503 1015
LQNQKKVEFKID 508 1016
QNQKKVEFKIDI 559 1017
NQKKVEFKIDIV 532 1018
QKKVEFKIDIVV 595 1019
KKVEFKIDIVVL 597 1020
KVEFKIDIVVLA 560 1021
VEFKIDIVVLAF 681 1022
EFKIDIVVLAFQ 659 1023
FKIDIVVLAFQK 736 1024
KIDIVVLAFQKA 689 1025
IDIVVLAFQKAS 630 1026
DIVVLAFQKASS 746 1027
IVVLAFQKASSI 548 1028
VVLAFQKASSIV 567 1029
VLAFQKASSIVY 548 1030
LAFQKASSIVYK 465 1031
AFQKASSIVYKK 597 1032
FQKASSIVYKKE 577 1033
QKASSIVYKKEG 596 1034
KASSIVYKKEGE 559 1035
ASSIVYKKEGEQ 523 1036
SSIVYKKEGEQV 615 1037
SIVYKKEGEQVE 543 1038
IVYKKEGEQVEF 533 1039
VYKKEGEQVEFS 584 1040
YKKEGEQVEFSF 548 1041
KKEGEQVEFSFP 598 1042
KEGEQVEFSFPL 710 1043
EGEQVEFSFPLA 1456 1044
GEQVEFSFPLAF 1729 1045
EQVEFSFPLAFT 1556 1046
QVEFSFPLAFTV 1636 1047
VEFSFPLAFTVE 518 1048
EFSFPLAFTVEK 585 1049
FSFPLAFTVEKL 573 1050
SFPLAFTVEKLT 528 1051
FPLAFTVEKLTG 622 1052
PLAFTVEKLTGS 528 1053
LAFTVEKLTGSG 608 1054
AFTVEKLTGSGE 511 1055
FTVEKLTGSGEL 530 1056
TVEKLTGSGELW 573 1057
VEKLTGSGELWW 477 1058
EKLTGSGELWWQ 543 1059
Empty (control) 571



Panels F and G provide data on sequential alanine 
replacements for selected CD4 polypeptides. 

Panel F



PEPTIDE COUNTS SEQ ID NO:
ZZZZZZDTYICEVED 5844 1060
ZZZZZZATYICEVED 5921 1061
ZZZZZZDAYICEVED 6362 1062
ZZZZZZDTAICEVED 1301 1063
ZZZZZZDTYACEVED 2583 1064
ZZZZZZDTYIAEVED 4483 1065
ZZZZZZDTYICAVED 3154 1066
ZZZZZZDTYICEAED 3432 1067
ZZZZZZDTYICEVAD 3595 1068
ZZZZZZDTYICEVEA 5942 1069
ZZZZZZDTYICEVED 4973 1070
ZZZZZZDTYICEVED 4775 1070
ZZZZZZATYICEVED 4962 1071
ZZZZZZDAYICEVED 4163 1072
ZZZZZZDTAICEVED 1384 1073
ZZZZZZDTYACEVED 3085 1074
ZZZZZZDTYIAEVED 5128 1075
ZZZZZZDTYICAVED 2587 1076
ZZZZZZDTYICEAED 2499 1077
ZZZZZZDTYICEVAD 2706 1078
ZZZZZZDTYICEVEA 6345 1079
ZZZZZZDTYICEVED 5564 1080
EEVQLLVFGLTANSD 18582 1081
AEVQLLVFGLTANSD 16220 1082
EAVQLLVFGLTANSD 14220 1083
EEAQLLVFGLTANSD 18124 1084
EEVALLVFGLTANSD 10890 1085
EEVQALVFGLTANSD 11258 1086
EEVQLAVFGLTANSD 11954 1087
EEVQLLAFGLTANSD 13317 1088
EEVQLLVAGLTANSD 9573 1089
EEVQLLVFALTANSD 19348 1090
EEVQLLVFGATANSD 10408 1091
EEVQLLVFGLAANSD 19973 1092
EEVQLLVFGLTTNSD 20100 1093
EEVQLLVFGLTAASD 19390 1094
EEVQLLVFGLTANAD 17684 1095
EEVQLLVFGLTANSA 18227 1096
EEVQLLVFGLTANSD 19738 1097
EEVQLLVFGLTANSD 21338 1098
AEVQLLVFGLTANSD 14590 1099
EAVQLLVFGLTANSD 13213 1100
EEAQLLVFGLTANSD 16296 1101
EEVALLVFGLTANSD 13415 1102
EEVQALVFGLTANSD 12603 1103
EEVQLAVFGLTANSD 13690 1104
EEVQLLAFGLTANSD 16286 1105
EEVQLLVAGLTANSD 11480 1106
EEVQLLVFALTANSD 18254 1107
EEVQLLVFGATANSD 19978 1108
EEVQLLVFGLAANSD 18863 1109
EEVQLLVFGLTTNSD 20021 1110
EEVQLLVFGLTAASD 19200 1111
EEVQLLVFGLTANAD 17928 1112
EEVQLLVFGLTANSA 22206 1113
EEVQLLVFGLTANSD 18721 1114
THLLQGQSLTLTLES 7756 1115
AHLLQGQSLTLTLES 8602 1116
TALLQGQSLTLTLES 6931 1117
THALQGQSLTLTLES 7683 1118
THLAQGQSLTLTLES 7701 1119
THLLAGQSLTLTLES 4578 1120
THLLQAQSLTLTLES 8471 1121
THLLQGASLTLTLES 4238 1122
THLLQGQALTLTLES 8659 1123
THLLQGQSATLTLES 4430 1124
THLLQGQSLALTLES 8158 1125
THLLQGQSLTATLES 4380 1126
THLLQGQSLTLALES 11699 1127
THLLQGQSLTLTAES 862 1128
THLLQGQSLTLTLAS 2596 1129
THLLQGQSLTLTLEA 5849 1130
THLLQGQSLTLTLES 6545 1131
THLLQGQSLTLTLES 4787 1132
AHLLQGQSLTLTLES 5826 1133
TALLQGQSLTLTLES 5012 1134
THALQGQSLTLTLES 5059 1135
THLAQGQSLTLTLES 5120 1136
THLLAGQSLTLTLES 2956 1137
THLLQAQSLTLTLES 6393 1138
THLLQGASLTLTLES 1933 1139
THLLQGQALTLTLES 5151 1140
THLLQGQSATLTLES 1391 1141
THLLQGQSLALTLES 4749 1142
THLLQGQSLTATLES 813 1143
THLLQGQSLTLALES 8147 1144
THLLQGQSLTLTAES 797 1145
THLLQGQSLTLTLAS 2193 1146
THLLQGQSLTLTLEA 7984 1147
THLLQGQSLTLTLES 5947 1148
Empty (control) 569




Panel G 






PEPTIDE COUNTS SEQ ID NO:
GEQVEFSFPLAFTVE 20691 1149
AEQVEFSFPLAFTVE 18546 1150
GAQVEFSFPLAFTVE 17733 1151
GEAVEFSFPLAFTVE 17500 1152
GEQAEFSFPLAFTVE 14764 1153
GEQVAFSFPLAFTVE 16668 1154
GEQVEASFPLAFTVE 6793 1155
GEQVEFAFPLAFTVE 21681 1156
GEQVEFSAPLAFTVE 7767 1157
GEQVEFSFALAFTVE 20480 1158
GEQVEFSFPAAFTVE 10024 1159
GEQVEFSFPLTFTVE 17397 1160
GEQVEFSFPLAATVE 10130 1161
GEQVEFSFPLAFAVE 20627 1162
GEQVEFSFPLAFTAE 18797 1163
GEQVEFSFPLAFTVA 18371 1164
GEQVEFSFPLAFTVE 17662 1165
GEQVEFSFPLAFTVE 19190 1166
AEQVEFSFPLAFTVE 18042 1167
GAQVEFSFPLAFTVE 18079 1168
GEAVEFSFPLAFTVE 19756 1169
GEQAEFSFPLAFTVE 13000 1170
GEQVAFSFPLAFTVE 13930 1171
GEQVEASFPLAFTVE 6533 1172
GEQVEFAFPLAFTVE 20072 1173
GEQVEFSAPLAFTVE 7378 1174
GEQVEFSFALAFTVE 19480 1175
GEQVEFSFPAAFTVE 10589 1176
GEQVEFSFPLTFTVE 18318 1177
GEQVEFSFPLAATVE 9572 1178
GEQVEFSFPLAFAVE 19516 1179
GEQVEFSFPLAFTAE 16765 1180
GEQVEFSFPLAFTVA 18187 1181
GEQVEFSFPLAFTVE 18219 1182
ZZZZZZDTYICEVED 5017 1183
ZZZZZZDTYICEVEZ 5421 1184
ZZZZZZDTYICEVZZ 2166 1185
ZZZZZZDTYICEZZZ 922 1186
ZZZZZZDTYIZZZZZ 564 1187
ZZZZZZZTYICEVED 3031 1188
EEVQLLVFGLTANSD 23357 1189
EEVQLLVFGLTANSZ 15808 1190
EEVQLLVFGLTANZZ 16496 1191
EEVQLLVFGLTAZZZ 14097 1192
EEVQLLVFGLTZZZZ 16473 1193
EEVQLLVFGLZZZZZ 10516 1194
EEVQLLVFGZZZZZZ 10372 1195
EEVQLLVFZZZZZZZ 7333 1196
EEVQLLVZZZZZZZZ 1098 1197
ZEVQLLVFGLTANSD 16716 1198
ZZVQLLVFGLTANSD 5281 1199
ZZZQLLVFGLTANSD 4310 1200
ZZZZLLVFGLTANSD 1026 1201
ZZZZZLVFGLTANSD 664 1202
ZZZZZZVFGLTANSD 779 1203
ZZZZZZZFGLTANSD 760 1204
ZZZZZZZZGLTANSD 657 1205
EEVQLLVFGLTANSD 18040 1206
THLLQGQSLTLTLES 10850 1207
THLLQGQSLTLTLEZ 10269 1208
THLLQGQSLTLTLZZ 4668 1209
THLLQGQSLTLTZZZ 908 1210
THLLQGQSLTLZZZZ 844 1211
THLLQGQSLTZZZZZ 475 1212
THLLQGQSLZZZZZZ 548 1213
THLLQGQSZZZZZZZ 570 1214
THLLQGQZZZZZZZZ 442 1215
ZHLLQGQSLTLTLES 11445 1216
ZZLLQGQSLTLTLES 11631 1217
ZZZLQGQSLTLTLES 7993 1218
ZZZZQGQSLTLTLES 6887 1219
ZZZZZGQSLTLTLES 3305 1220
ZZZZZZQSLTLTLES 4453 1221
ZZZZZZZSLTLTLES 1086 1222
ZZZZZZZZLTLTLES 1201 1223
THLLQGQSLTLTLES 9756 1224
GEQVEFSFPLAFTVE 18856 1225
GEQVEFSFPLAFTVZ 16222 1226
GEQVEFSFPLAFTZZ 12535 1227
GEQVEFSFPLAFZZZ 11384 1228
GEQVEFSFPLAZZZZ 5846 1229
GEQVEFSFPLZZZZZ 4749 1230
GEQVEFSFPZZZZZZ 2208 1231
GEQVEFSFZZZZZZZ 3277 1232
GEQVEFSZZZZZZZZ 742 1233
ZEQVEFSFPLAFTVE 19736 1234
ZZQVEFSFPLAFTVE 18684 1235
ZZZVEFSFPLAFTVE 12892 1236
ZZZZEFSFPLAFTVE 12166 1237
ZZZZZFSFPLAFTVE 2134 1238
ZZZZZZSFPLAFTVE 1454 1239
ZZZZZZZFPLAFTVE 1391 1240
ZZZZZZZZPLAFTVE 1489 1241
GEQVEFSFPLAFTVE 18867 1242
Empty (control) 580





Example 11 

This example characterizes CD4 receptor sequences found to
have HIV gp120 binding activity in screening tests. Panel
A displays information obtained from sequential replacement
of amino acid residues by alaninyl residues. In Panel A, a
(+) signifies a decrease in binding affinity, whereas a (>)
indicates that replacement of the residue by an alaninyl
residue yields an increase in binding affinity. Sequences
are shown with the amino-terminus at the top and the
carboxyl-terminus at the bottom. Right and left sides are
from independent assays.
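
The sequential alanine replacement summarized in Panel A can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure used to generate the data: the function names are hypothetical, and the 1.5-fold cutoff used to assign (+) or (>) is an arbitrary placeholder, since no numerical criterion is given. The threonine fallback for positions that are already alanine mirrors the rows in Panels F and G (for example, EEVQLLVFGLTTNSD).

```python
def alanine_scan(parent: str, substitute: str = "A", fallback: str = "T") -> list[str]:
    """Return one variant per position, with that residue replaced.

    Positions that already hold `substitute` receive `fallback` instead,
    as seen in the tabulated peptides.
    """
    variants = []
    for i, residue in enumerate(parent):
        replacement = fallback if residue == substitute else substitute
        variants.append(parent[:i] + replacement + parent[i + 1:])
    return variants


def classify(parent_counts: float, variant_counts: float, fold: float = 1.5) -> str:
    """Label a variant '+' (decrease), '>' (increase), or '' (no clear change).

    `fold` is a hypothetical cutoff chosen only for this sketch.
    """
    if variant_counts * fold <= parent_counts:
        return "+"
    if variant_counts >= parent_counts * fold:
        return ">"
    return ""


variants = alanine_scan("DTYICEVED")
print(variants[2])            # DTAICEVED
print(classify(5844, 1301))   # + (counts taken from Panel F)
```

Under this placeholder cutoff, replacing the tyrosine of DTYICEVED is flagged as a decrease, in line with the marks shown for that residue in Panel A.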



Panel A. 



105-113    116-130    131-145    216-229
D          E          T          G
T          E          H          E
++Y++      V          L          Q
+I+        +Q+        L          +V+
C          +L+        +Q+        +E+
+E+        +L+        G          ++F++
+V+        +V+        +Q+        S
+E+        +F+        S          ++F++
D          G          +L+        P
           +L         T          ++L++
           T          +L++       A
           A          >T>        ++F++
           N          +++L+++    T
           S          ++E++      V
           D          S          E



Panel B indicates the effect on binding affinity when
successive amino acid residues are deleted, either from the
amino-terminus (right-side symbols) or from the
carboxyl-terminus at the bottom (left-side symbols). A (+)
signifies a decrease in binding affinity, and the
underlined residues indicate which residue was the last
residue to be serially deleted.
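
The serial deletions described above can be sketched in Python as follows. The sketch assumes that the letter Z in the tabulated peptides marks a deleted position, which the padded sequences in Panels F and G suggest but which is not stated explicitly here; the function name `serial_truncations` is hypothetical.

```python
def serial_truncations(
    parent: str, max_removed: int, placeholder: str = "Z"
) -> tuple[list[str], list[str]]:
    """Return (carboxyl_series, amino_series) of serially truncated peptides.

    Each series removes 1..max_removed residues from one terminus and
    pads with `placeholder` so every peptide keeps the parent length.
    """
    if not 0 <= max_removed <= len(parent):
        raise ValueError("max_removed must be between 0 and len(parent)")
    carboxyl = [
        parent[:len(parent) - n] + placeholder * n
        for n in range(1, max_removed + 1)
    ]
    amino = [placeholder * n + parent[n:] for n in range(1, max_removed + 1)]
    return carboxyl, amino


carboxyl, amino = serial_truncations("GEQVEFSFPLAFTVE", 8)
print(carboxyl[0], carboxyl[-1])  # GEQVEFSFPLAFTVZ GEQVEFSZZZZZZZZ
print(amino[0], amino[-1])        # ZEQVEFSFPLAFTVE ZZZZZZZZPLAFTVE
```

Comparing the counts for each truncated peptide with those of the full-length parent then indicates where along the sequence binding begins to fall.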


Panel B. 



105-113    116-130      131-145    216-229
D+         E            T          G
T          E+           H          E
Y          V+           L+         Q+
I          Q++          L+         V+
C          L+++         Q++        E+++
+++E       L+++         G++        F+++
++V        V+++         Q+++       S++++
+E         ++++F++++    +++S+++    ++++F++++
D          ++G          +++L       +++P
           +L           +++T       +++L
           T            +++L       ++A
           A            ++T        ++F
           N            ++L        +T
           S            +E         +V
           D            S          E

All publications cited herein are hereby incorporated
by reference to the same extent as if each publication were
individually and specifically indicated to be incorporated
by reference and were set forth in its entirety herein.

While this invention has been described with an
emphasis upon preferred embodiments, it will be obvious to
those of ordinary skill in the art that variations of the
preferred embodiments can be used and that it is intended
that the invention can be practiced otherwise than as
specifically described herein. Accordingly, this invention
includes all modifications encompassed within the spirit
and scope of the invention as defined by the following
claims.



